When we started publication of the Mathematics Magazine we undertook
to have our research articles judged by our readers, as is done in
some other fields. However usage seems to have developed the feeling 
that when an article in mathematics has been published it is no longer 
a subject for critical analysis. In fact we have received only one 
negative criticism and it was neither analytical nor constructive.
Careful reading of the large number of papers we receive quickly 
became impossible for the editors. Hence we have asked and received 
the valuable services of the following referees, for which we are 
very grateful: 

E. F. Beckenbach, Clifford Bell, Brockway McMillan, Herbert Busemann, 
Bernard Friedman, Alford Horn, D. H. Lehmer, Norman Levinson, R. S. 
Phillips, J. F. Randolph, E. Snapper, I. S. Sokolnikoff, E. G. Straus,
J. Dean Swift, A. F. Taylor, W. R. Wasow, and A. L. Whiteman. 
ELEMENTS OF A MATHEMATICAL
THEORY OF PROBABILITY 


J. H. Curtiss 


1. Introduction. It seems to be a characteristic of maturity in 
human thought, whether it be concerned with the sciences or with 
everyday affairs, ultimately to recognize that everything is subject 
to variation and change. The realization finally dawns that conclusions 
are never final and predictions are never sure, however much they may 
be supported by contemporary evidence; and that often it is best to 
state them in a form which emphasizes - even quantifies - their
uncertainty.

This pattern seems to be deeply bound up with certain principles of 
human behavior. Primitive or ignorant men, when confronted with poorly 
understood, uncontrollable situations, turn in their anxiety to ritual, 
to authority, to revelation. This stage of thinking may eventually be 
accompanied (or even replaced) by a rationalism of high intellectual 
order. But there is little room for organized concepts of chance here. 

As factual knowledge grows in the community, and with it, control 
over environment, the authoritarian point of view tends to be replaced 
by the empirical approach. The trouble is that empiricism, honestly 
and studiously pursued, keeps running into baffling problems of change 
and chance. At first, it may be neither wise nor convenient to give 
open recognition to such problems, but sooner or later they simply 
must be faced. 

All this can be illustrated in many ways; for example, by reciting 
the history of witchcraft, or of religion, or of various special 
sciences, such as agriculture or weather forecasting. There is space 
here only to look at a very broad instance, and then to touch on some
typical special cases.

The broad instance is that of the philosophy of the natural sciences 
in the western world. Omitting the earliest beginnings in superstition 
and theology, we start with the classical concept of the natural sciences,
which goes back to the Greeks and which attained towering achievements
in the seventeenth and eighteenth centuries. It was that of a body
of unique, clearly defined, unchanging laws. Human frailty being what 
it is, it was possible to apprehend the system only little by little; 
but it was supposed to exist in a complete and perfect state just the 
same. It would presently become clear by the exercise of the processes 
of reasoning alone, just as a field of mathematics is clarified. Experi- 
mental data could sometimes be an interesting guide, but for drawing 
ultimate conclusions they were untrustworthy. The important truths were 
self-evident, or if not, would become so as the thinking evolved. 

Euclidean geometry was in fact considered the model branch of science 
within this classic conception. When the non-Euclidean geometries were
discovered in the nineteenth century, and then were actually employed 
in theoretical physics, a heavy blow was given to the rationalistic 
tradition. Many other factors influenced the swing away from rationalism; 
the story is far too complicated even to sketch here. But the fact 
is that empiricism has now largely replaced the classic tradition,
and with it has finally come rather full recognition of the importance 
of chance and change. The modern scientist freely admits that even 
his most strongly supported beliefs and theories have a contingent 
character. His reliance is now primarily on the adequacy of methods 
rather than on the infallibility and uniqueness of conclusions.* 

In a scientific world dominated by number and measurement, even such 
an elusive concept as uncertainty seems to demand a scale of measurement, 
and a clearly understandable definition, and perhaps even a theory. 
Actually the crude elements of a quantitative approach to probability 
have been present for a long time, and there was a substantial develop- 
ment of the theory long before the rationalistic doctrines in the 
philosophy of science were thrown out. Nagel [13] discerns certain 
statistical notions in the writings of Aristotle on biology. Cardan 
in the sixteenth century, Galileo, Pascal, Fermat, and Huygens in the 
seventeenth century, the Bernoullis, De Moivre, and Bayes in the 
eighteenth, all solved special problems involving quantitatively measured 
probabilities. 

Their work was brought to a climax in Laplace’s Théorie analytique
des probabilités (1812). This astonishingly detailed and complete
treatise was accompanied by an influential popular exposition, Essai
philosophique sur les probabilités. Laplace’s thinking dominated the
field for a hundred years, and it was only in the early part of the 
twentieth century that substantial advances began to be made again in 
probability theory. 

The motivation for the seventeenth century work in probability theory 
was largely frivolous, but during the eighteenth century attempts were 
made to find probabilistic answers to serious questions arising in 
the theory of observations, in public administration, in judicial
procedure, and in theoretical physics. These attempts culminated in a
string of brilliant successes in the nineteenth century; among them
are the Maxwell theory of gases, Boltzmann’s work, and the Mendelian
theory in genetics. The formulation of theories and results in
quantitative probabilistic terms has now become a familiar phenomenon in
physics, chemistry, biology, and many other branches of science. 


*Nagel [13], pp. 1-4.


2. The problem of the definition. Just what is probability? Halmos,
a mathematician, says firmly [7], “Probability is a branch of
mathematics. It is not a branch of experimental science nor armchair
philosophy, it is neither physics nor logic... . The situation is
analogous to that in geometry.” He then goes on to develop “probability”,
in his sense, as a branch of mathematical measure theory - a procedure
which we follow here when the time comes to turn to mathematics.

But most philosophers, and many scientists, would say that what 
Halmos was talking about was just a part of probability theory - the 
part sometimes identified nowadays as “mathematical probability theory”.
They would consider the mathematician much too ambitious who would 
preempt the entire field of probability for mathematics - that is, 
unless he wishes to view mathematics in a greatly extended sense and 
is willing to take many strange new troubles on his shoulders. The
non-mathematical problems involved in probability seem more extensive
and more subtle than even those involved in the physical interpretations 
of geometry and classical analysis. 

Consider the following statements in which the term probability or 
its derivatives appear. It will be noted that they all have quite a 
familiar sound. They will be divided into two groups. 

First group: “It is more probable that it will rain in Los Angeles
in January than in June”. “The probability that a neutron from a
radioactive source placed against a certain lead shield will pass
through that shield is 0.345”. “The probability that a uniform pair
of dice will show a total of seven when tossed is 1/6”. “The probability
is 0.95 that in random sampling from a normal population with known 
standard deviation σ and unknown mean μ, the interval (x̄ - 1.96σ/√n,
x̄ + 1.96σ/√n) will cover μ, where x̄ is the arithmetic mean of the n sample
observations”. “The probability that an American male in a certain 
profession will survive his fiftieth birthday is 0.422”.

Second Group: “It is improbable that Bacon wrote Shakespeare”. 
“Probably there is life somewhere in the universe in addition to that 
on the earth”. “It is probable that if Napoleon had not had certain 
physical ailments, the entire course of history would be changed”. “It 
is more probable that the Aztecs got to Mexico by way of Alaska than 
by paddling canoes across the Pacific”. “Probably this man was
murdered”. “The theory of evolution is more probable on the evidence than
the Biblical theory of creation”.

The statements of the first group all have one thing in common: 
the word probability is being used somehow to indicate the intensity 
of one’s expectation that an event will happen. It is being used in
a predictive sense. In the second group, the word seems to have something
to do with the adequacy of evidence relating to events or situations 
which already have occurred. 

It is the opinion of most students of probability that a more or
less satisfactory mathematical interpretation and theory can be adduced
for the term probability as used in the statements of the first group.
On the other hand, it does not seem to be at all certain that this
can be done for the second group*. In any case, with so many implications
to cover, the task of choosing and evaluating a mathematical model - a
project which belongs on the bridge between mathematics and the physical
world, rather than in the field of mathematics itself - looks as if
it might be quite as difficult intellectually as the formal development
of the model chosen.

*A discussion will be found in Nagel [13], Section III.

If there is to be a mathematical theory, it is pretty evident that 
it would be helpful, although not necessary, to have some sort of 
precise metrical definition of the word “probability” to use inside
the mathematical theory as a starting point. Unfortunately neither the
users nor the students of probability are agreed among themselves as
to how to do this. Certain philosophers, notably Reichenbach, appear 
to favor the idea of defining probability as the limit (in the classical 
sense of mathematical analysis) of a sequence of relative frequencies 
in an infinite reference class. Von Mises [12] in a series of papers 
which started to appear shortly after World War I, was the first to 
study this definition mathematically. To make the mathematical theory 
useful, he found it necessary to restrict the reference class by 
postulating that certain subsequences of relative frequencies all 
have the same limit. Other mathematicians subsequently have shown that 
this restriction is so severe that the definition is now quite generally 
regarded as unsuccessful in spite of many attempts to fix it up.* 

Others like to go back to Laplace’s formulation of a probability 
as the ratio of the cardinal numbers of two sets of alternatives, with
the set corresponding to the denominator including the set corresponding
to the numerator. (This is the mathematical content of the definition 
usually proposed in college algebra books.) Closely related to this 
is the old definition of “geometric probability” as the ratio of two
lengths or two areas, without reference to physical applications.
Certain physicists propose that probability shall be a measure of
“partial beliefs”, given certain evidence. There are numerous other
definitions, too, principally proposed by philosophers and requiring 
the technical language of philosophy to state. 

But although the semantical and pragmatic problems involved in
probability have not been solved, nevertheless as implied in Section 1,
probability concepts are successfully being used throughout modern 
science and industry. Then too, by the beginning of the present century, 
quite a large body of mathematical theory had been worked out. The 
literature of these studies contained a good deal of attractive
mathematics, and conveyed a promise of more gold to be found with a little
digging. As the new century wore on, more and more mathematicians 
began taking trips into the gold fields. Under the added stimulus of 
the applications the mathematical literature began to grow by leaps and 
bounds. 

*See Doob [2], and also the discussion which follows this reference in
the same issue of the Annals of Mathematical Statistics.

Now one of the fashions of the day in mathematics is the axiomatic
approach. For this to operate effectively in a branch of applied
mathematics, it should be based on physical phenomena which are simple,
fundamental, and which have been widely experienced. The present-day 
mathematical probabilists wanted to axiomatize their discipline. What 
intuitive or physical concept of probability should they choose as the 
basis? In view of the philosophical difficulties in probability theory 
touched on above, the decision was not altogether an easy one to make. 

The requirements of simplicity and wide understanding, together with 
the mechanics of many of the applications, dictated the answer. The 
basic intuitive concept which most of the mathematicians now working 
in probability theory use is this: Physically speaking, a probability 
shall be regarded as a stabilized, long-run value of the relative 
frequency of the occurrence of an event in repeated trials (under 
“identical” conditions) of an experiment which can result in this and 
in other events. 

The construction of the mathematical foundations then proceeds 
roughly as follows. A fixed number is arbitrarily selected to represent 
the idealized value of the relative frequency. A calculus of these 
numbers is then developed on the basis of a Boolean algebra whose
elements consist of “events”, which are taken as the completely funda- 
mental, undefined elements of the mathematical theory, just as points 
and straight lines are the undefined elements of Euclidean geometry. 
It turns out that the calculus actually looks like a rather highly 
developed special case of the general theory of distributions of mass - a 
theory which has been studied for a long time in physics and engineering. 

This construction does not attempt at the outset to provide as 
faithful a description of the empirical situation as did that of Von
Mises. It does not include in its axiomatics, for example, a
characterization of “randomness”; but the characterization appears later on
in the guise of certain theorems. Incidentally, the simplicity of the 
axiomatics has the fortunate result that the ensuing calculus can readily 
be used in connection with a number of other physical interpretations 
of probability. One of these is the definition used by Laplace involving 
the ratio of alternatives, and all the classical mathematical work 
in probability theory can therefore be accepted almost without change*. 


*In connection with the axiomatics of mathematical probability, Prof.
Mark Kac pointed out to the author that one of David Hilbert’s famous
twenty-three problems included the task of axiomatizing probability
theory. The statement of Hilbert’s problem runs as follows [10], p. 306:

“Durch die Untersuchungen über die Grundlagen der Geometrie wird uns
die Aufgabe nahe gelegt, nach diesem Vorbilde diejenigen physikalischen
Disziplinen axiomatisch zu behandeln, in denen schon heute die Mathematik
eine hervorragende Rolle spielt: dies sind in erster Linie die
Wahrscheinlichkeitsrechnung und die Mechanik.

“Was die Axiome der Wahrscheinlichkeitsrechnung angeht, so scheint
es mir wünschenswert, dass mit der logischen Untersuchung derselben
zugleich eine strenge und befriedigende Entwicklung der Methode der
mittleren Werte in der mathematischen Physik, speziell in der kinetischen
Gastheorie Hand in Hand gehe.”

[In English: “The investigations on the foundations of geometry suggest
to us the task of treating axiomatically, after this model, those
physical disciplines in which mathematics already plays a prominent
role today: these are, in the first place, the theory of probability
and mechanics. As to the axioms of the theory of probability, it seems
to me desirable that their logical investigation go hand in hand with
a rigorous and satisfactory development of the method of mean values
in mathematical physics, especially in the kinetic theory of gases.”]

Prof. Kac surmises that Hilbert would not have been satisfied by the
measure-theoretical approach outlined in the sequel in the present paper.

An exposition of the elements of this mathematical theory of
probability will now be given. As it unfolds, it is hoped that the
mathematical reader, whatever his preconceived notions and prejudices
about probability may be, will perceive that the mathematical theory forms
a perfectly respectable branch of mathematics. To be sure, it has its
own special notation, and its nomenclature has a feature typical of
all mathematics intended to be applied, in that purely mathematical
objects are sometimes called by names which have distracting physical
or psychological connotations (e.g. “moment”, “expectation”). But for
all that, it is just as rigorous by current mathematical standards as,
say, modern topology. The areas of doubt and controversy which have
been pointed out earlier do not lie in the realm of mathematics. Their
existence only serves warning that however satisfying the theory may
be from the mathematical viewpoint, it might end up by not being
accepted by the users, at least in its present form. The chances of this
look rather small at present, but they are not entirely negligible.


3. The definition of mathematical probability for the discrete case.*
Our starting point is at the principal undefined notions of the
mathematical theory. First we introduce the concept of a simple event.
Consider the set of absolutely all thinkable or distinguishable outcomes
of an experiment or observation. These are the simple events of the
experiment. They may or may not be naturally characterized by numbers
through the conditions of the experiment; this has nothing to do with
the concept. For example, in the toss of a coin, the simple events
are head and tail. In the experiment consisting of dealing at the card
game known as bridge, each of the possible deals is a simple event.
In tossing two dice, a red one and a green one, each of the possible
throws (such as deuce on the red one and five-spot on the green one)
is a simple event. In statistical mechanics, the various possible
states of a system correspond to simple events.

It will be convenient to employ geometric terminology. Thus the
simple events for an experiment will henceforth be considered to be
points in a certain space**, which will be called the sample space
for the experiment. The sample space is sometimes called the event
space. In statistical mechanics it is called phase space.

The concepts of simple event and of sample space are the basic
undefined elements of our theory, in just the same way that points
and straight lines are the undefined elements in an axiomatic treatment
of Euclidean geometry.

*The treatment in this section and the next roughly parallels that given
in Kolmogorov [11], but certain si... The terminology
generally follows Feller [4] and Halmos [7].

**The word space is used in this article in the standard sense of modern
analysis; it simply means a collection of elements.

We have been using the two words “simple event” instead of just the
single word “event” because we propose to use the unmodified word
“event” to denote something less fundamental; namely a result of an 
experiment which is obtained if any one of several of the simple events 
occurs. For example, in the dice-tossing experiment we might be in- 
terested in obtaining a total of seven in one toss of the two dice. 
This occurs if any one of a number of the simple events occurs; they 
are the various individual throws which add up to a seven. In the card
dealing case, it might be convenient to consider all at once the deals
in which North is dealt only spades. This is a composite event again -
or just an “event” in the terminology we shall adopt - which is brought
about if any one of the numerous deals occurs (each of which is a simple
event) which put all the spades into North’s hand. 

Moving over to the geometric picture again, it is natural to
represent an event, in the sense just introduced, by the set or aggregate
of points in the sample space which consists of all of the simple 
events whose occurrence makes it occur. Henceforth then, an event will 
mean the same as a set of points in a sample space. 
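
As an illustration of the identification of events with point sets, the
following minimal sketch enumerates the sample space for the two-dice
experiment and builds the event "total of seven" as a subset of it. The
names sample_space and total_seven, and the representation of a simple
event as an ordered pair (red, green), are assumptions made for this
example only.

```python
from itertools import product

# Sample space for one toss of a red die and a green die:
# each point (simple event) is an ordered pair (red, green).
sample_space = set(product(range(1, 7), repeat=2))

# The event "total of seven" is the set of all simple events
# whose occurrence makes it occur.
total_seven = {point for point in sample_space if sum(point) == 7}

print(len(sample_space))    # number of simple events
print(sorted(total_seven))  # the individual throws adding up to seven
```

The throw "deuce on the red one and five-spot on the green one" appears
here as the point (2, 5), which is one member of total_seven.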

We shall use capital letters like A to denote individual points 
(simple events) in the sample space and lower case italicized letters 
like a, b, ..., to denote sets of points (events) in the sample space. 
Of course we do not intend by this notation to exclude the case in
which a point set a consists of only one point.

If the sample space contains only a finite number of points, or a 
countable infinity of them (this means that they can be arranged in 
a sequence A_1, A_2, ...), then the sample space is called discrete.
We shall confine ourselves to discrete sample spaces in the next few
paragraphs. 

Suppose that (at least in imagination) an experiment can be repeated
over and over in such a way that from intuitive or experimental
considerations there is reason to hope that the respective relative
frequencies of the occurrence of the simple events will become stabilized
near certain fixed numbers as the repetitions proceed. Then with each
point A of the sample space, we shall associate a real number p(A),
0 ≤ p(A) ≤ 1, which is to be regarded physically as the idealized
value, or hypothetical value, or proposed value, of the corresponding 
relative frequency. We call it the probability of A. 
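
The stabilization of relative frequencies can be imitated, though of
course not proved, by a small simulation. The sketch below is only an
illustration under stated assumptions: a pseudo-random generator stands
in for the physical coin, the value 0.5 for the probability of a head is
an assumed specification, and the function name relative_frequencies is
hypothetical.

```python
import random

def relative_frequencies(num_trials, checkpoints, seed=0):
    """Simulate coin tosses and record the relative frequency of heads
    at each requested number of trials."""
    rng = random.Random(seed)   # seeded so the run is repeatable
    heads = 0
    recorded = {}
    for trial in range(1, num_trials + 1):
        heads += rng.random() < 0.5   # assumed p(head) = 0.5
        if trial in checkpoints:
            recorded[trial] = heads / trial
    return recorded

freqs = relative_frequencies(100_000, {10, 100, 1_000, 10_000, 100_000})
for n in sorted(freqs):
    print(n, freqs[n])
```

In a typical run the early values wander noticeably, while the later
ones settle near the assumed value; it is this settled value that p(A)
is meant to idealize.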

Following the relative frequency guide, we define the probability 
of an event a consisting of two or more distinct simple events to be 
the sum of the probabilities of the component simple events. The
probability of a will be written pr(a). The case in which a consists of an
infinite sequence A_1, A_2, ... is included; the meaning assigned to
the infinite series Σ p(A_i) which arises in this connection is the
usual one of mathematical analysis.

Further following the relative frequency guide, the assignment of
the function p(A) is made so that Σ p(A_i) = 1, where the summation
is to be extended over all the points A_1, A_2, ... of the sample space.

The probability distribution in the sample space is defined to be 
pr(a), considered as a function of the set a. Alternatively, in the 
discrete case here under consideration, it can be defined to be the 
point function p(A). 
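
The two descriptions of the distribution, the point function p(A) and
the set function pr(a), can be put side by side in a short sketch. It
assumes the two-dice sample space with equal probabilities 1/36 at each
point (the "uniform pair of dice" of Section 2); the dictionary p and
the function pr are illustrative names, not part of the theory.

```python
from fractions import Fraction
from itertools import product

# Points of the sample space: ordered pairs (red, green).
sample_space = list(product(range(1, 7), repeat=2))

# Point function p(A); the uniform assignment is an assumption.
p = {A: Fraction(1, 36) for A in sample_space}

def pr(a):
    """Probability of an event a: the sum of p(A) over its points."""
    return sum((p[A] for A in a), Fraction(0))

total_seven = {A for A in sample_space if sum(A) == 7}

print(sum(p.values()))   # total probability over the whole space
print(pr(total_seven))   # 1/6
```

Exact fractions are used so that the requirement that the p(A_i) sum to
1 can be checked without rounding error.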

We stop here for some observations of a philosophical nature. In 
the first place, the reader should be careful to distinguish the mathe- 
matical elements in the above material from the physical ones, even 
though we have deliberately put them side by side. All references to 
relative frequency were only for motivation and guidance as to the 
appropriate features of the mathematical model. Mathematically,
probability as viewed here is merely a non-negative, bounded function defined
on the points of the sample space and having certain additive properties. 
The actual choice of the function p(A) in a given instance, and the 
psychological implications involved in calling it a probability, are 
matters which lie outside the realm of mathematics. 

But the reader may still wonder if the specification of p(A) is 
really not the heart of the whole matter; if it cannot be done
accurately, then the whole theory may be useless and it is futile to go on
and develop it further. There are several answers to this question.
In the first place, it is perfectly true that in the present physical
formulation, and also with a formulation based on limits (in the sense 
of classical mathematical analysis) of relative frequencies in infinite 
sequences of experiments, probabilities can never be assigned to events 
with finality. Therefore with such a theory every probability speci- 
fication is forced to play the role of a scientific hypothesis. But 
researchers in the present-day mathematical theory of probability 
have been notably successful in developing tools within the theory 
for the testing of such hypotheses in the light of observed data. With 
such tests at hand, the hypothesis viewpoint is quite in the spirit of 
modern empirical science, in which the cycle of hypotheses → theoretical
consequences → obtaining data → testing of theory with the data → new
hypotheses, is a standard part of the methodology.

Then too, surprisingly useful information about the general behavior 
of a stochastic* system can sometimes be obtained even with a very 
crude specification of the initial probabilities. For example, sequences
of tosses of a coin exhibit certain interesting regularity and
irregularity properties which are pretty much independent of whether or not
it is assumed that the probability of a head is 1/2. And finally there
is a big body of asymptotic probability results in which the eventual 
distributions of certain stochastic systems are shown to be approximate- 
ly independent of the initial specification. 

*Stochastic is a synonym for probabilistic.

Another less profound question which may be bothering the reader
is this: how does the above definition of probability meet the
requirements of the classical elementary combinatorial problems in
probability theory? The answer is easy to give. In the language of our
present discussion, the typical combinatorial problem in the college
algebra text-books describes in more or less vague terms an experiment
which clearly calls for a finite discrete sample space. Considerations of
symmetry, and of experience (they are sometimes dressed up in some
rather shaky philosophy which goes back to the eighteenth century)
suggest that the appropriate probability specification is one which
assigns equal probabilities to all the points in the sample space.
The problem then asks for the probability of a certain event a. The
solution is merely a matter of counting up the sample points which
comprise a. The task can be by no means trivial, and is facilitated
by using the various tools of combinatorial analysis.

All the classical “urn problems” are of this character. For example,
there is the old urn problem (translated here from urn language to be
more entertaining to the reader) which asks for the probability that 
if a person writes n letters and puts them “at random” in the addressed 
envelopes, all the letters go astray. The sample space consists of 
the n! assignments of the letters to the envelopes; the implication 
is that all assignments are to be probabilized equally. The answer 
involves some rather careful counting, and comes out to be (curiously 
enough) approximately 1/e for large n, where e is the well-known 
constant arising in elementary calculus. 
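
For small n the counting in the letters problem can be carried out by
brute force, which gives a rough check on the limiting value 1/e. The
sketch below assumes, as the problem implies, that all n! assignments
are equally probable; the function name prob_all_astray is hypothetical,
and direct enumeration is used here in place of the careful
combinatorial counting referred to above.

```python
from fractions import Fraction
from itertools import permutations
from math import e, factorial

def prob_all_astray(n):
    """Probability that none of n letters lands in its own envelope,
    found by counting over all n! equally likely assignments."""
    astray = sum(
        1
        for perm in permutations(range(n))
        if all(perm[i] != i for i in range(n))
    )
    return Fraction(astray, factorial(n))

# Compare the exact probabilities with 1/e as n grows.
for n in range(1, 9):
    print(n, float(prob_all_astray(n)), 1 / e)
```

Enumeration grows as n!, so this is practical only for quite small n;
it is meant to show the sample-point counting, not to replace it.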

The hypothetical nature of the probability specification is possibly 
somewhat obscured in the case of such impractical problems, but it 
would rapidly come to the fore if one tried to use the answers to 
them in real-life gambling operations. Many more serious examples
involving the same techniques, but taken from physics and other branches 
of science, will be found in the newest books on mathematical proba- 
bility, notably those of Feller [4] and Fortet [5]. In these applica- 
tions, the hypothetical character of the probability specification 
is much more apparent, because the specification is clearly a part 
of the hypotheses of a physical theory which must stand or fall on the 
evidence of experimental data. 


4. Extension to more general sample spaces. The discrete type of
sample space which has been considered so far is adequate for many 
purposes, but it is not a suitable basis for a general theory. In the 
first place, if we are going to set up an acceptable model for the 
applications in the natural and social sciences, we must provide for 
events which consist of measurements made, at least conceptually, 
on a continuous scale. The mathematician may argue all he wishes that, 
in practice, measurements will always be discrete because they will 
always be rounded off. His colleagues in the other sciences are not
going to be satisfied with purely discrete mathematical theories - not
yet, anyhow. They have good arguments on their side from the
mathematical viewpoint, too; continuous formulations are often much easier
to manipulate than discrete ones.

But the idea of non-discrete sample spaces turns out to be much
more than a gadget added to the theory to please the clients; it is 
necessary to attain completeness in the study of certain familiar 
experiments which, when considered abstractly, are seen to consist 
of a non-countable number of simple events. An example is given by 
the game of craps. Another is the problem of the number of tosses of 
a coin required to get the first head. To see why the simple events 
are uncountable in such cases, consider the latter problem, and suppose 
that no a priori restrictions are imposed on the number of tosses.
If we denote a head by 1 and a tail by 0, a typical infinite sequence
of tosses could be written symbolically as 11010001011...,
and this must be regarded as one of the simple events of the experiment. 
With a binary point out in front of the symbol, it can be regarded 
as a binary number on the unit interval. The sample space then consists 
of all such binary numbers; that is, of all the points in the unit 
interval. Now it is well known that the points of the unit interval 
are not countable, so a more general kind of sample space is called 
for than that which we have heretofore considered.* 
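
The correspondence between toss sequences and points of the unit
interval can be sketched for a finite prefix of a sequence. This is
only an approximation to the idea in the text, since a program can read
finitely many digits; the function name sequence_to_number is
hypothetical.

```python
from fractions import Fraction

def sequence_to_number(tosses):
    """Read a finite prefix of a toss sequence (1 = head, 0 = tail)
    as the binary fraction 0.t1 t2 t3 ... on the unit interval."""
    return sum(Fraction(t, 2 ** (k + 1)) for k, t in enumerate(tosses))

prefix = [1, 1, 0, 1, 0, 0, 0, 1, 0, 1, 1]  # the digits shown in the text
x = sequence_to_number(prefix)
print(x, float(x))
```

Every further toss adds one more binary digit, so the infinite sequence
fixes a single point of the unit interval, of which x is only the
leading part.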

True, this example (and that of the crap game) can be given a satis- 
factory mathematical treatment by restricting the number of throws 
a priori to a finite number - that would have to be done in real life 
anyway - and then resorting to limiting processes. But to study the
structure of “random” sequences like the sequence of 0’s and 1’s above
and to deal satisfactorily with questions of the type which Von Mises
attempted to answer in his axiomatics, it is quite necessary to free
the theory from any obvious restrictions as to finiteness. This becomes
even more urgent in the case of certain important applications to
economics and physics, where it is necessary to think of a stochastic
system evolving continuously in time. 

We shall accordingly now outline an approach which provides a wide 
generalization of the definition of mathematical probability previously 
given. 

The clue to the way to proceed lies in a closer study of the idea 
of an event. We have identified an event with a set of points in the 
sample space. This set consists of the simple events, the occurrence 
of any one of which makes the event occur. Vow to every event a, we 
sense that there must correspond another event a’ which has the property 
that if a does not occur, a’ does, and vice versa. Thus we are led 
to define for each event a its complementary event a’, which consists 
of all of the points of the sample space not in a’. Another intuitive 
idea is that of at least one of two events a and 6 occurring. This will 
happen if the experiment results in a simple event contained in either 


*Incidentally, this sort of correspondence between sequences of trials
and numbers on the unit interval has provided a useful connection between
number theory and probability theory; see Feller [4] pp. 161-163.




(1953) ELEMENTS OF A MATHEMATICAL THEORY OF PROBABILITY 243


a or b or in the set common to both. We shall therefore define for
every pair of events a and b another event denoted by $a \cup b$, which
is identified with the set of all points in the sample space which
belong to at least one of the sets a and b. In point set theory, $a \cup b$
is called the union of a and b. Of course the extension to a finite
collection of events is immediate; $a \cup b \cup c \cup \dots$ means the aggregate
of points belonging to at least one of the sets a, b, c, ....

Still another intuitive idea associated with two events a and b is
that of both of them occurring. This will take place if any simple
event occurs for which the corresponding point lies in both a and b
simultaneously. We shall formalize this by defining for each pair a and
b a new event $a \cap b$, represented by the set of all points in the sample
space common to both a and b. In point set terminology, $a \cap b$ is the
intersection of a and b. Similarly, if a, b, c, ..., is a finite collection
of events, $a \cap b \cap c \cap \dots$ represents the simultaneous occurrence
of all of these events.

Of course, $a \cap b$ may be empty; this is the case in which a and b
are disjoint or are mutually exclusive. It is convenient to enlarge
our concept of event slightly to include this case too. We therefore
invent the “impossible” event o, and say that a = o means that the
set a is empty.

It is also convenient to have a symbol, say e, for the “certain”
event consisting of all the points in the sample space.

Now the really important thing about the operations $'$, $\cup$, $\cap$, which
we have just introduced, is that they clearly must obey certain algebraic
laws. For example, $a \cup b = b \cup a$, with a similar relation for $\cap$; these
are commutative laws. There is an associative law, also, for each of
these two symbols. There are various obvious relations between the
symbols; for example, $(a \cap b)' = a' \cup b'$, $a \cap (b \cup c) = (a \cap b) \cup (a \cap c)$.
The latter is a distributive law.

It now begins to appear that the naive, seemingly formless idea
of an event is not so innocent as it seemed at first, and that a mathematical
treatment is possible. There is in fact an algebra of events.
This algebra is of a recognizable type which has long been studied.
If a non-empty class $\mathcal{A}$ of elements is such that (i) if two elements
a and b are both in $\mathcal{A}$, then $a \cup b$ is also in $\mathcal{A}$, and (ii) if a is in
$\mathcal{A}$ then a' is in $\mathcal{A}$, where $\cup$ and $'$ satisfy the algebraic laws sketched
above, then this class $\mathcal{A}$ is called a Boolean algebra. It is clear
that with the extensions of the concept of event given above, the
totality of events for any experiment may be viewed as the elements
of a Boolean algebra.
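
As an illustration only, the algebra of events can be sketched with finite sets in Python. The sample space (two tosses of a coin), the particular events a, b, c and the helper name `complement` are assumptions chosen for the example; the asserts check a few of the laws mentioned above for these particular events and do not constitute a proof.

```python
from itertools import product

# Sample space for two tosses of a coin; each point is a simple event.
SAMPLE_SPACE = frozenset("".join(p) for p in product("HT", repeat=2))

def complement(a):
    """The event a': all points of the sample space not in a."""
    return SAMPLE_SPACE - a

a = frozenset({"HH", "HT"})   # head on the first toss
b = frozenset({"HH", "TH"})   # head on the second toss
c = frozenset({"HH", "TT"})   # the two tosses agree
o = frozenset()               # the impossible event
e = SAMPLE_SPACE              # the certain event

# Commutative laws for union (|) and intersection (&).
assert a | b == b | a and a & b == b & a
# (a ∩ b)' = a' ∪ b'
assert complement(a & b) == complement(a) | complement(b)
# Distributive law: a ∩ (b ∪ c) = (a ∩ b) ∪ (a ∩ c)
assert a & (b | c) == (a & b) | (a & c)
# a and a' are mutually exclusive and together make up e.
assert a & complement(a) == o and a | complement(a) == e
```

The class of all subsets of this four-point sample space is closed under union and complementation, so it is a Boolean algebra in the sense just defined.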

To take care of certain complicated situations such as that of the
infinite series of coin tosses mentioned above, it turns out that
a restriction must be imposed on the class of admissible events. This
restriction consists of requiring that the union of an infinite sequence
of events shall have a well-defined meaning and shall itself be an
event in the algebra. In the coin tossing case, it is reasonable to
talk about the particular event a of the coin never turning up a head
at all. If $b_j$ is the event that head turns up for the first time on
the j-th toss, then it is pretty obvious that a' (the event that the
coin does eventually turn up a head) should be the union of $b_1, b_2, \dots$,
provided we can assign a precise meaning to that. (It is true that
this example is somewhat artificial, as it can be handled by limiting
processes and anyhow we feel that here a = o, a' = e, which we have
already agreed to put in our algebra. Further discussion of the need
for the restriction will be found below.)

To put this new restriction in precise terms, we first introduce
another fundamental symbol. If a and b are any two events which satisfy
the relation $a \cup b = b$, we write $a \subset b$ and say (in the language of
events) “a implies b,” or (in the language of point sets) “a is contained
in b,” or “a is smaller than b.” We consider a = b to be a special
case of $a \subset b$. Given an aggregate of elements, it is now possible to
talk about the smallest one. If it exists, it is an element a in the
aggregate such that if b is any other element in the aggregate, then
$a \subset b$.

Now consider an arbitrary infinite sequence $a_1, a_2, \dots$ of elements
of a Boolean algebra. Let b be an element of the algebra which contains
every one of the elements $a_1, a_2, \dots$. Consider the class of all such
elements b. If this class contains a smallest one, say a, then we
define $a_1 \cup a_2 \cup a_3 \cup \dots$ to mean a. If such an element a exists for
every such sequence $a_1, a_2, \dots$, the algebra is called a $\sigma$-algebra.

Our restriction on the class of admissible events is that it shall
always be a $\sigma$-algebra.

We are now ready to define mathematical probability for general
sample spaces. Once again, we shall let the obvious properties of
relative frequencies guide us, and furnish motivation. Consider the
$\sigma$-algebra of the events for a given experiment. Suppose that the experiment
can be repeated over and over (at least in the imagination)
in such a way that the relative frequency of occurrence of any event
a will become stabilized about a certain fixed value as the repetitions
proceed. Then with each event a, we associate a real number, Pr(a),
$0 \le \Pr(a) \le 1$, which is regarded physically as the idealized value,
or hypothetical value, or proposed value, of the corresponding relative
frequency. This number is called the probability of a.

It will be remembered that the $\sigma$-algebra of events contains the
impossible event o and the certain event e. To these we assign respectively
the values Pr(o) = 0, Pr(e) = 1.

Again following the relative frequency guide, if a and b are any
two mutually exclusive events, then we define $\Pr(a \cup b) = \Pr(a) + \Pr(b)$.

The properties of Pr(a) postulated so far imply that if $a_1, a_2, \dots$
is any infinite sequence of mutually exclusive events, then
$\Pr(a_1 \cup a_2 \cup \dots \cup a_n) = \Pr(a_1) + \Pr(a_2) + \dots + \Pr(a_n)$ for n finite. Furthermore the
limit as n becomes infinite of the sum on the right-hand side must
exist, because the sum is monotonically non-decreasing and yet bounded
by unity. In other words, the infinite series $\Pr(a_1) + \Pr(a_2) + \dots$
has a sum in the usual sense. Now it will be remembered that $a_1 \cup a_2 \cup \dots$
is a member of the $\sigma$-algebra, and so a probability $\Pr(a_1 \cup a_2 \cup \dots)$
has been assigned to it. Our final stipulation regarding the set-function
Pr(a) is that the sum of the convergent series $\Pr(a_1) + \Pr(a_2) + \dots$
is precisely equal to $\Pr(a_1 \cup a_2 \cup \dots)$. In technical language, this
means that the set-function Pr(a) is countably additive.
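
A small numerical sketch may make the stipulation concrete. It returns to the coin-tossing events $b_j$ (head first appears on the j-th toss) and assumes, purely for illustration, a fair coin with independent tosses, so that $\Pr(b_j) = 2^{-j}$; the function names are hypothetical.

```python
from fractions import Fraction

def pr_first_head_at(j):
    """Pr(b_j) under the assumed fair-coin hypothesis: j-1 tails, then a head."""
    return Fraction(1, 2) ** j

def partial_sum(n):
    """Pr(b_1) + ... + Pr(b_n), which is Pr(b_1 ∪ ... ∪ b_n) by finite additivity."""
    return sum(pr_first_head_at(j) for j in range(1, n + 1))

for n in (1, 2, 5, 10):
    # Each partial sum equals 1 - 2**(-n): non-decreasing and bounded by unity.
    assert partial_sum(n) == 1 - Fraction(1, 2) ** n
    print(n, partial_sum(n))
```

The partial sums increase toward 1, so under this hypothesis countable additivity assigns the value 1 to the union of all the $b_j$, the event that a head eventually appears.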

The set-function Pr(a) is called the probability distribution for 
the experiment. 

At the beginning of Section 2 of this article, we asked the question,
“Just what is probability?” The developments of the present section
have put us in no better position to answer this question in its philosophical
sense. But we can now give a completely precise answer to
the question, “Just what is mathematical probability in the present
formulation?” It is simply a countably additive, non-negative set
function, bounded from above by unity, defined on the elements of a
$\sigma$-algebra.

It is easily seen that the definition of probability for discrete
sample spaces is merely a special case of the general definition given
immediately above. The comments made in the discrete case in connection
with the problem of specifying the probability distribution still hold
true. In the applications, the specification is always in the nature
of a scientific hypothesis.

In mathematical statistics, where testing of hypotheses is of primary
interest, it is a common practice to specify for an experiment not
just a single probability distribution Pr(a), but all at once a whole
family of probability distributions $\Pr(a; \theta)$. The parameter $\theta$, which may
be multidimensional, varies on a space $\Omega$ of “admissible hypotheses”.

The extreme values 0 and 1 of Pr(a) require some attention. We
do not exclude the possibility of assigning the probability 0 to a set
a other than o, if it seems reasonable to do so. This would mean in
practice that the occurrence of a is thought to be unlikely, but not
necessarily impossible. Similarly, we do not exclude the assignment
of a probability of 1 to an event other than e. It should be remembered
that a probability is not supposed to reproduce a precisely
determined value of a relative frequency, and - we say this again, at
the risk of being repetitious - in applying the present theory every
mathematical probability is to be regarded as only a scientific hypothesis.

The upper bound of unity on Pr(a) is of course suggested by the 
situation for relative frequencies. However, if other intuitive or
physical concepts of probability are used in connection with the algebra 
of events which we built up in the preceding section, then there may 
be no reason why the scale of Pr(a) should be limited to the unit 




246 MATHEMATICS MAGAZINE (May-June 
interval. In fact, it might even be convenient to put the values of 
Pr(a) in the complex plane*. 

There is a close relationship between mathematical probability as 
defined above and Lebesgue measure; an elementary discussion will be 
found in the article of Halmos [7]. 

It is worth emphasizing, although it is a rather technical point,
that in the present treatment a probability has not been assigned to
all events a in a given sample space, but just to those which belong
to a $\sigma$-algebra. Thus in the case of the experiment consisting of an
infinite sequence of coin tosses, suppose that the sample space is
chosen to be the unit interval. Then not all point sets (that is,
events) in the interval would be “probabilized” if the present definition
of probability is used, but only the point sets belonging to a
$\sigma$-algebra in this interval.

If the restriction that the algebra of events to be probabilized
must be a $\sigma$-algebra is dropped, and if the sample space is a relatively
complicated one (for instance, if it can be put into one-to-one correspondence
with the unit interval) then there may be events on which
a non-negative, additive set function just cannot be defined in a 
unique and non-contradictory way. The question is much the same as 
that of the existence of non-measurable sets in the Lebesgue theory, 
which is discussed in detail in the modern measure-theory texts (see 
for example [8], pp. 67-72). Of course, such “non-probabilizable” events 
may be artificial, and even pathological, and certainly without physical 
interpretation, but that would not make the mathematician any happier 
if he had to deal with an inconsistent theory. 

On the other hand, there are sophisticated cases in which the restriction
to a $\sigma$-algebra is certainly too severe. The lifting of the restriction
has recently been the subject of a good deal of research,
and certain technical questions about the existence of “non-probabilizable”
events (using definitions of probability suitably modified over
the one given here) have not yet been settled.**


5. Further definitions; elementary theorems. Using only the definitions
given in Section 4 together with the algebra of events, certain more
or less obvious theorems can be proved. We shall exhibit only one
of them - one of considerable importance in the theory.

Notice the identity $a \cup b = a \cap b' + a \cap b + a' \cap b = a + a' \cap b =
b + b' \cap a$ (here + denotes a union of mutually exclusive events).
This is derivable from simpler, more fundamental identities,
but if the reader will translate it aloud into event language, it
will be very obvious. By the additive property of probability, we have

$$\Pr(a \cup b) = \Pr(a \cap b) + \Pr(b' \cap a) + \Pr(a' \cap b),$$
$$\Pr(a \cup b) = \Pr(a) + \Pr(a' \cap b),$$
$$\Pr(a \cup b) = \Pr(b) + \Pr(b' \cap a).$$


*In quantum theory in physics, certain complex functions are introduced
called probability amplitudes which are given a probabilistic significance.
The squares of their absolute values are treated as ordinary
probability densities (this term is defined below in section 8). See
Heisenberg [9], Chap. IV.

**See ..., footnotes on pp. 50, 53, 164.


Combining these, we obtain:
THEOREM 1. $\Pr(a \cup b) = \Pr(a) + \Pr(b) - \Pr(a \cap b)$.


This says that the probability that one or the other of two events
occurs is equal to the sum of the probabilities that they occur individually
minus the probability that they both occur simultaneously.
It clearly reduces to a mere statement of the additive property of
probability if $\Pr(a \cap b) = 0$.
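
Theorem 1 can be checked numerically on a small example. The sketch below assumes one throw of a die with equally likely faces; the events and the helper `pr` are chosen only for illustration.

```python
from fractions import Fraction

# One throw of a die, all six simple events assumed equally likely.
SAMPLE_SPACE = frozenset(range(1, 7))

def pr(event):
    """Probability of an event under the equal-likelihood hypothesis."""
    return Fraction(len(event), len(SAMPLE_SPACE))

a = frozenset({2, 4, 6})   # an even number of spots
b = frozenset({4, 5, 6})   # more than three spots

lhs = pr(a | b)                      # Pr(a ∪ b)
rhs = pr(a) + pr(b) - pr(a & b)      # Pr(a) + Pr(b) - Pr(a ∩ b)
assert lhs == rhs == Fraction(2, 3)
```

Here a and b share the points 4 and 6, so simply adding Pr(a) and Pr(b) would count those points twice; the subtracted term corrects for this.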

Consider now the quantity $\Pr(a \cap b)$; that is, the probability of
the two indicated events occurring simultaneously. It is natural to put
a slightly different twist on this concept, and look into the possibility
of defining the probability of the event which consists of the occurrence
of b when a is known to have occurred. Once again, relative frequencies
will be our guide. If in N trials of an experiment, a occurs N(a) times,
b occurs N(b) times, and they both occur together $N(b \cap a)$ times,
then $N(b \cap a)/N(a)$ is the relative frequency of occurrence of b within
the sequence of trials in which a occurred. This fraction can be written
as

$$\frac{N(b \cap a)/N}{N(a)/N},$$

and we have already defined the abstractions of the numerator and
denominator of this expression as respectively $\Pr(b \cap a)$ and Pr(a).

We are thus led to the definition of the conditional probability of
b, given a, as $\Pr(b \mid a) = \Pr(b \cap a)/\Pr(a)$. The tacit assumption is
always made in defining conditional probability as a quotient that the
probability in the denominator is not equal to zero.

With a held fixed, it is easy to see that the class of events of
the type $b \cap a$ forms a $\sigma$-algebra, with $e = a \cap a$. If $b \cap a = o$, then
$\Pr(b \mid a) = 0/\Pr(a) = 0$. If b = a, $\Pr(b \mid a) = \Pr(a)/\Pr(a) = 1$. Also,
$\Pr(b \mid a)$ satisfies the addition rule of probabilities for mutually
exclusive events $b_1, b_2, \dots$, because by the distributive law of the
event algebra*

$$\Pr(b_1 \cup b_2 \cup b_3 \cup \dots \mid a) = \Pr[(b_1 \cup b_2 \cup b_3 \cup \dots) \cap a]/\Pr(a)$$
$$= \Pr[(b_1 \cap a) \cup (b_2 \cap a) \cup \dots]/\Pr(a)$$
$$= \Pr(b_1 \cap a)/\Pr(a) + \Pr(b_2 \cap a)/\Pr(a) + \dots$$


*It is used to pass between the second and third members of the equation above.

Since $a = b \cap a + b' \cap a$, it follows that another way to write
$\Pr(b \mid a)$ is $\Pr(b \cap a)/[\Pr(b \cap a) + \Pr(b' \cap a)]$; this shows that
$\Pr(b \mid a) \le 1$.

The upshot of all this is that conditional probability has all the 
mathematical properties of just plain probability, and satisfies all 
the general theorems. 
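
The properties just listed can be illustrated on a finite example. The following is a sketch only, again assuming a fair die; `pr` and `pr_given` are hypothetical helper names, and `pr_given` makes the tacit non-zero-denominator assumption explicit.

```python
from fractions import Fraction

SAMPLE_SPACE = frozenset(range(1, 7))   # one throw of a die, assumed fair

def pr(event):
    return Fraction(len(event), len(SAMPLE_SPACE))

def pr_given(b, a):
    """Pr(b | a) = Pr(b ∩ a) / Pr(a); not defined when Pr(a) = 0."""
    if pr(a) == 0:
        raise ValueError("conditioning event has probability zero")
    return pr(b & a) / pr(a)

a = frozenset({4, 5, 6})        # more than three spots
b = frozenset({2, 4, 6})        # an even number of spots
b1, b2 = frozenset({4}), frozenset({6})   # mutually exclusive events

assert pr_given(a, a) == 1                        # b = a gives 1
assert pr_given(frozenset({1}), a) == 0           # b ∩ a = o gives 0
assert pr_given(b1 | b2, a) == pr_given(b1, a) + pr_given(b2, a)
assert pr_given(b, a) == Fraction(2, 3)
```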

Conditional probability is a concept of central importance from
both the philosophical and the mathematical point of view. It is not
an exaggeration to say that practically all of the philosophical and
psychological theories of probability somewhere harbor the idea that
the intensity of one’s belief in the occurrence of an uncertain event,
or in the existence of a situation, or in the “truth” of a statement,
is always relative to the evidence at hand. In a sense, then, probability
is almost always used in a conditional sense by all human
beings. From a less philosophical viewpoint, conditional probability
is essential in various ways for the development of the mathematical
model. The most fundamental application is that it provides the simplest
way to approach the very important concept of statistical independence.
Then too, it is one of the tools required for the study of time series
and sequences of associated or correlated events - phenomena which
under the name of “stochastic processes” are receiving a great deal
of attention in the literature today.

There are several ways to rewrite the definition of conditional 
probability. Because of their historical interest we shall elevate 
two of the formulas to the dignity of theorems. The first is what is 
sometimes known as the theorem, or law, of compound probabilities: 


THEOREM 2. $\Pr(a \cap b) = \Pr(a)\Pr(b \mid a)$.


The second formula has the distinction, unusual in a discipline
so long associated with games of chance, of being named after an English
clergyman. Let $b_1, b_2, \dots$ be a sequence of events which are mutually
exclusive and such that $e = b_1 \cup b_2 \cup \dots$. Then $a = a \cap e =
(a \cap b_1) \cup (a \cap b_2) \cup \dots$. Also, $(a \cap b_i) \cap (a \cap b_j) = o$, $i \ne j$,
and so by the additive property of probability, $\Pr(a) = \sum_j \Pr(a \cap b_j)$.
We substitute this into the denominator of the fraction which defines
$\Pr(b_k \mid a)$ and then use Theorem 2 on both numerator and denominator,
obtaining

THEOREM 3. $$\Pr(b_k \mid a) = \frac{\Pr(b_k)\Pr(a \mid b_k)}{\sum_j \Pr(b_j)\Pr(a \mid b_j)}.$$


This is called Bayes’ theorem. It has caused a great deal of excitement
and controversy in its day. For a long time it was the principal
tool of statistical inference. The classical interpretation was to
view the events $b_j$ as “causes” and the event a as the result of an
experiment affected by these causes. The formula then was supposed
to give the probability that $b_k$ caused a. In many cases it is quite
possible to deduce reasonable values for the probabilities $\Pr(a \mid b_j)$
which appear in the formula, but the hitch in practice lies in properly
estimating the “a priori probabilities” $\Pr(b_j)$. The
limitations of Bayes’ theorem are now pretty well understood, and
other much better techniques for statistical inference are now available.*
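
The arithmetic of Theorem 3 can be sketched as follows. The two-urn setting and all the numbers are hypothetical, invented only to exercise the formula; in particular the equal a priori probabilities are exactly the kind of assumption whose justification is the practical difficulty described above.

```python
from fractions import Fraction

def bayes(priors, likelihoods):
    """Return Pr(b_k | a) for every k, given Pr(b_k) and Pr(a | b_k)."""
    if sum(priors.values()) != 1:
        raise ValueError("the events b_k must exhaust the sample space")
    # Denominator of Theorem 3: Pr(a) = sum over j of Pr(b_j) Pr(a | b_j).
    total = sum(priors[k] * likelihoods[k] for k in priors)
    return {k: priors[k] * likelihoods[k] / total for k in priors}

# Hypothetical numbers: an urn is chosen, then a ball is drawn from it;
# a is the event "the ball drawn is white".
priors = {"urn 1": Fraction(1, 2), "urn 2": Fraction(1, 2)}        # Pr(b_k)
likelihoods = {"urn 1": Fraction(3, 4), "urn 2": Fraction(1, 4)}   # Pr(a | b_k)

posterior = bayes(priors, likelihoods)
assert posterior["urn 1"] == Fraction(3, 4)
assert sum(posterior.values()) == 1
```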

We shall now define statistical independence; that is, independence
in the probabilistic (as opposed to functional) sense. This is the
last of our really fundamental definitions.

The intuitive meaning of the independence of the events a and b
is that the knowledge that a has occurred in no way affects our expectations
as to the occurrence of b. Returning to the relative frequency
guide, this should mean that N(b)/N and $N(b \cap a)/N(a)$ ought to be about
the same; that is, the occurrence or non-occurrence of a ought to have
no bearing on the frequency of occurrence of b. This leads us to make
the following definition: The events a and b are statistically independent
if and only if $\Pr(b \mid a) = \Pr(b)$.

Another way to write this, in view of Theorem 2, is $\Pr(a \cap b) =
\Pr(a)\Pr(b)$. From this it is quickly seen that if a and b are independent,
then not only do we have $\Pr(b \mid a) = \Pr(b)$, but also $\Pr(a \mid b) = \Pr(a)$.

The appropriate extension to the case of three events a, b, and c
might seem to be via the route of requiring them to be independent
in pairs, but it turns out that this is not enough for a satisfactory
definition. For example, it does not insure that $\Pr(a \cap b \cap c)/\Pr(a) =
\Pr(b \cap c)$.** The easy way out of the difficulty is to say that three
events are independent if (1) they are independent in pairs, and (2)
$\Pr(a \cap b \cap c) = \Pr(a)\Pr(b)\Pr(c)$. The generalization to more than three
events proceeds similarly.
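
That independence in pairs is not enough can be seen on a small example. The sketch below uses a standard illustration (not necessarily the one in the reference cited): two tosses of a coin with equally likely outcomes, and three events chosen for the purpose.

```python
from fractions import Fraction
from itertools import product

SAMPLE_SPACE = frozenset("".join(p) for p in product("HT", repeat=2))

def pr(event):
    """Equal probability 1/4 is assumed for each simple event."""
    return Fraction(len(event), len(SAMPLE_SPACE))

a = frozenset({"HH", "HT"})   # head on the first toss
b = frozenset({"HH", "TH"})   # head on the second toss
c = frozenset({"HH", "TT"})   # the two tosses agree

# Condition (1): the events are independent in pairs ...
assert pr(a & b) == pr(a) * pr(b)
assert pr(a & c) == pr(a) * pr(c)
assert pr(b & c) == pr(b) * pr(c)
# ... but condition (2) fails: Pr(a ∩ b ∩ c) is 1/4, not 1/8.
assert pr(a & b & c) == Fraction(1, 4)
assert pr(a) * pr(b) * pr(c) == Fraction(1, 8)
```

Knowing that both a and b occurred makes c certain, which is why the three events cannot reasonably be called independent.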

6. Random Variables. Most of the results of modern mathematical probability
theory are stated in a convenient although special language:
that of random variables. It is a terminology which mathematical purists
have always found rather irritating, because random variables are
not independent variables in the strict mathematical sense (they are
functions), and on closer acquaintance they do not seem to be very
random, whatever that may mean. Furthermore, the standard notation
used to represent a random variable is somewhat inexplicit.

Nevertheless the concept is intuitively helpful and satisfying.
A random variable roughly means a quantity which is associated with
the simple events of an experiment to which a probability distribution
has been assigned; and this quantity assumes various values with the
probabilities derived from the association. In brief, it is a variable
quantity whose values are determined by chance.


*See Cramér [1], in particular pp. 507 ff. It is only fair to state that
with other physical concepts of probability than the one used in the
present article, Bayes’ theorem can be much more important.

**See Feller [4], p. 87.

To arrive at a more precise definition, we must make use of an
idea borrowed from the modern theory of functions of a real variable;
namely, that of a measurable function.* Consider the sample space for
an experiment, and the associated $\sigma$-algebra of events. By a measurable
function in the present context, we mean a vector function $\mathbf{X} = \mathbf{X}(A) =
[X_1(A), X_2(A), \dots, X_k(A)]$ whose values lie in a Euclidean k-space,
and which is defined and finite (that is, none of the components $X_j(A)$
is infinite) for each point A in the sample space, and which has the
property that the set of points A in the sample space such that simultaneously
$X_1(A) \le C_1, X_2(A) \le C_2, \dots, X_k(A) \le C_k$ is always a member
of the $\sigma$-algebra for any and all values of the C’s, including infinite
ones.

Any such measurable function defined on a sample space will be called 
a random variable. 

A random variable is always taken to be a single-valued function, 
in the sense that to any point A in the sample space there corresponds 
just one point in the k-dimensional Euclidean space. This plays an 
essential role later on. If k = 2, the random variable may be presented 
in complex-variable notation as $X = X_1 + iX_2$; there is no change in
the definition of measurability if this is done. If k = 1, we drop 
the bold-face type and use the same notation for the random variable 
as was used above for the components. 

It is not unusual to encounter situations in which several different
random variables are defined all at once over a sample space. If they 
are all one-dimensional random variables, they may be thought of as 
being the components of a single vector random variable. However, a 
warning must be given with respect to the interpretation of vector 
random variables constructed in this way. Consider a concrete example. 
A coin is tossed twice. The simple events of the experiment, in an 
obvious code, are HH, HT, TH, TT. Let X and Y be two random variables 
set up simultaneously on the sample space, with X assuming the values
1, 2, 3, 4 on HH, HT, TH, TT respectively, and Y the values 7, 8, 
9, 10 on HH, HT, TH, TT. Now (X,Y) could be considered as a vector 
random variable whose space was the XY-plane, but the values that this 
vector variable takes on as the argument ranges over the sample space 
are only the values (1,7), (2,8), (3,9), (4,10). Other points of the 
plane, such as (1,8), even though they lie in the product space** 


*See the article [6] by J. W. Green in this series. 
**Given a space of elements (or points) X, and another space of elements 
(or points) Y, their product space (or Cartesian product space) is simply 
the set of all ordered pairs (X,Y). Thus the Euclidean plane is the 
product space of two coordinate axes. The Cartesian product of more than
two spaces is defined similarly.




of the values of X and Y, would not be included among the values of 
the vector variable. (This product space would pertain to a new, more 
complicated experiment.)
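
The warning can be made concrete with a short sketch. The dictionaries below simply encode the random variables X and Y of the coin example; the variable names `attained` and `product_space` are illustrative.

```python
from itertools import product

# X and Y as functions on the sample space {HH, HT, TH, TT}.
X = {"HH": 1, "HT": 2, "TH": 3, "TT": 4}
Y = {"HH": 7, "HT": 8, "TH": 9, "TT": 10}

# Values actually taken by the vector variable (X, Y): one per sample point.
attained = {(X[point], Y[point]) for point in X}

# Product space of the values of X and the values of Y: all ordered pairs.
product_space = set(product(X.values(), Y.values()))

assert attained == {(1, 7), (2, 8), (3, 9), (4, 10)}
assert len(product_space) == 16
assert (1, 8) in product_space and (1, 8) not in attained
```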

A value of a random variable X obtained by actually performing the 
underlying experiment, getting the event A, and therefrom deriving 
a numerical value for X through the functional correspondence X = X(A), 
is called a determination of X, or an observation on X, or a realization 
of X. This terminology, of course, pertains to concepts which lie on 
the bridge between mathematics and the physical world, and are not a 
part of the mathematical structure which we are now erecting. 

Random variables are very often initially introduced into a proba- 
bility problem just to provide numerical labels for the points of the 
sample space. This step becomes so natural to workers in probability 
and statistics that they often take it unconsciously before starting 
a discussion. The more fundamental concept of sample space is almost 
never mentioned in the literature of the mathematical theory of proba- 
bility except in discussions of the foundations. 

The transition from sample space to random variables in many cases
is made more painless by the circumstance that the most reasonable
system of coding for an experiment often drops a random variable in
one’s lap, so to speak. An example is the experiment consisting of the
toss of a die. Here the obvious way to designate the six simple events
$A_1, A_2, \dots, A_6$ is by the number of spots shown; but this sets up a
random variable on the sample space at once. Of course it may be desirable
to define simultaneously a number of other random variables on
this same sample space; one of them, for instance, might be that given
through the formula Y = exp(iX), where X is the number of spots shown.

More complicated examples along this line are those given by “proba- 
bilizable” physical measurements made on a continuous scale. If the 
measurements themselves are numbers, as they usually are, they furnish 
random variables at once. A mathematical justification for short- 
circuiting the sample space concept in such a case, and in many other 
situations as well, will be given in a moment. 

The notion of random variable as defined above, like that of sample
space, is sterile without the associated concept of the probability
distribution of a random variable. The general idea is simply that
this distribution is to be derived from that in the sample space by 
making corresponding sets in the space of the random variable and in 
the sample space have the same probability. To do this consistently 
requires some little attention to detail. 

To be specific, let X = X(A) be a measurable vector function defined 
over a sample space on which a probability distribution Pr(a) has 
been set up. For convenience we enlarge (if necessary) the space con- 
sisting of all the values taken on by X(A), as A ranges over the sample 
space, so that it is an entire k-dimensional Euclidean space, which 
we henceforth call the X-space. By the inverse image of a set x in
the X-space, we mean the set $a_x$ of all points A in the sample space
such that X(A) lies in x; thus A lies in $a_x$ (written $A \in a_x$) if and
only if X(A) lies in x.
Notice that by the definition of single-valued function, every A has
one and only one X, although a single value of X may correspond to
several points A; also that if the values of X(A) do not fill up
the X-space completely as A ranges over the sample space, there will
be some sets x with empty inverse images. The inverse image of an
empty set will be, by agreement, the empty set. The inverse image of
the entire X-space is the entire sample space, or e, in our previous
terminology.

We must first restrict the class of sets x in the X-space to be
“probabilized”, because otherwise their inverse images might not
belong to the $\sigma$-algebra in the sample space. (It will be recalled that
the definition of measurable function insures that the inverse images
of X-sets only of a certain extremely simple type belong to the $\sigma$-algebra
in the sample space.) The proper restriction is to the family of Borel
sets of the X-space. These are defined as the $\sigma$-algebra of sets of
Euclidean k-space obtained by applying the operations $\cup$, $\cap$, $'$ a finite
or countably infinite number of times to X-sets of the simple type
appearing in our definition of measurable function. It follows at once
that every Borel set x has an inverse image $a_x$ belonging to the $\sigma$-algebra
in the sample space.

After all these preliminaries, the probability distribution P(x)
of the random variable X = X(A) will now be defined. Assuming that
x is a member of the family of Borel sets in X-space, the definition
is given simply by the equation $P(x) = \Pr(a_x)$.

A few moments’ thought on the part of the reader will assure him
that this equation defines P(x) uniquely and without contradiction
as a countably additive set function, with $0 \le P(x) \le 1$. It assigns
a probability of zero to the empty set (and also to any set whose
inverse image is empty), and a probability of unity to the whole X-space.
In other words, P(x) has all the characteristic properties of the
probability distribution Pr(a) in the sample space. The probability
distribution of X, as so defined, is said to be induced by that in
the sample space.

No modification to the procedure for inducing distributions is 
required for random variables which are complex-valued functions. 
They are treated simply like vector functions with k = 2. 

If Y = Y(X) is a single-valued Borel-measurable function of X (meaning
that the inverse image of the interval $Y \le Y_0$ is always a Borel set
for every $Y_0$), then clearly Y = Y(X(A)) is a measurable function defined
on the sample space, so Y is itself a random variable. Its distribution
may be derived from that of X (or from that in the sample space - the
result is the same) in exactly the same way as that in which the distribution
of X(A) was obtained. The extension of these remarks to
vector functions Y = Y(X) is immediate.

Although a Borel-measurable function Y = Y(X) maps any Borel set
of the Y-space back onto a Borel set of the X-space, this is not
necessarily true for a Lebesgue-measurable function and Lebesgue-measurable
sets. This is one of the reasons for using Borel sets and
Borel-measurable functions in probability theory.

The mechanism by which the probability distribution in the sample
space induces a probability distribution in the X-space depends heavily
on the single-valued character of the function X = X(A). This prevents
two disjoint sets $x_1$ and $x_2$ from having the same inverse image $a_x$.
If this had been allowed to happen, then there obviously would have
been difficulty in apportioning $\Pr(a_x)$ to $x_1$ and $x_2$ individually.

To put this all in another way, if X = X(A) is single-valued but
not inversely single-valued, the probability distribution of X can
be induced as above, but the process cannot be made to work the other
way: that is, given the distribution of X, this distribution does not
induce a unique distribution in the sample space. On the other hand,
if X = X(A) is inversely single-valued, as would be the case if X(A) were
introduced merely to label the individual points of the sample space
with distinct numerical labels, then by reversing the roles of the
sample space and the X-space in the above discussion, the probability
distribution in the sample space can be uniquely derived from that in
the X-space. Under these circumstances, it makes no difference at all
which distribution is specified first. Similar remarks apply to the
case of a function Y = Y(X), with the roles of A and X assumed by X
and Y.

This is the mathematical justification promised earlier in this 
section for putting primary emphasis on random variables instead of 
sample spaces when working in probability and statistics, even if the 
concept of sample space is the more fundamental one. Most sample spaces
arising in practice are of a character such that their points can be 
given distinct numerical labels whose values lie in the space of all 
real numbers, or in Cartesian products* of spaces of real numbers. As 
stated before, this process is so natural that it is often performed 
unconsciously before starting a problem. In fact in many cases it would 
be merely pedantic to distinguish between the sample space and the 
space of the labeling random variables. And the moment the labeling 
has been done, thenceforth the corresponding random variables might 
as well occupy the center of the stage as far as probability considera- 
tions are concerned. 

A constant C can be treated as a random variable because it can be 
viewed as a function. In fact, it often is so treated in the literature 
of mathematical probability. Its distribution, when induced according 
to the above rules by that in any sample space, is always one in which 
a probability of unity is assigned to the point C, and zero to any set 
not containing C. 


*Actually, uncountably many factors may be needed; that is, sample spaces 
arise in practice which are abstractly identical with function spaces. 
Two methods of specifying probability distributions in function spaces 
are described and compared in t I 

The modern literature of probability and statistics contains countless 
examples of random variables and induced probability distributions. 
Here we shall have space only for some very simple concrete illustrations 
of the process of inducing a probability distribution for a random 
variable. They will all be based on the coin tossing experiment 
mentioned earlier, in which the simple events were HH, HT, TH, TT. 

A natural probability specification in the sample space for this 
experiment is one which assigns equal probability to each of the simple 
events. This implies that each of them gets the probability 1/4. One 
of the various random variables proposed previously merely labeled 
these points respectively 1, 2, 3, 4. Its induced probability 
distribution is therefore given by the equations P(j) = 1/4, j = 1, 2, 3, 4. 
Another possible random variable, not mentioned previously, is a variable 
X which assumes the value 1 on any of the three points in the sample 
space containing H in its code, and 0 on TT. The total probability 
attached to the inverse image of X = 1 is 3/4, so the distribution of 
X is P(1) = 3/4, P(0) = 1/4. 
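
The process of inducing a distribution can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `sample_space`, `induce`, and `has_head` are hypothetical and do not appear in the article. Each value of the random variable simply collects the total probability of its inverse image.

```python
from collections import defaultdict
from fractions import Fraction

# Sample space of the two-toss experiment with the equal-probability
# specification described in the text (each simple event gets 1/4).
sample_space = {
    "HH": Fraction(1, 4),
    "HT": Fraction(1, 4),
    "TH": Fraction(1, 4),
    "TT": Fraction(1, 4),
}


def induce(prob, X):
    """Return the distribution of X induced by `prob` on the sample space.

    The probability of a value x is the total probability of its inverse
    image, i.e. of all sample points A with X(A) == x.
    """
    dist = defaultdict(Fraction)  # Fraction() is zero
    for point, p in prob.items():
        dist[X(point)] += p
    return dict(dist)


def has_head(point):
    """The variable X of the text: 1 if the code contains H, else 0."""
    return 1 if "H" in point else 0


dist_X = induce(sample_space, has_head)
```

Because `has_head` is single-valued but not inversely single-valued, `dist_X` cannot be turned back into a unique distribution on the four sample points, which is the asymmetry the preceding discussion describes.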

Still another possible random variable is a vector function which 
plots the simple events as four points on the Euclidean plane as 
follows: (0,0) for TT, (1,0) for HT, (0,1) for TH, (1,1) for HH. Its 
induced probability distribution is given by P(j,k) = 1/4, j = 0, 1, 
k = 0, 1. 


CYCLOIDAL MOTION OF ELECTRONS 
S. E. Rauch 


1. Introduction. 


The cycloid has found its way into important applications in the 
present fields of atomic physics and electronics. Engineering problems 
which consider the motion of charged particles when moving through 
a region of crossed electric and magnetic fields deal directly with 
cycloidal paths and velocities. The following article analyzes the 
characteristics of the cycloidal path and illustrates briefly two 
applications. 

Let the following concerning forces acting upon an electron be 
assumed as hypotheses: 

a. When a force acts upon an object it produces an acceleration in 
the direction of the force with a magnitude proportional to the force. 
Units are so defined that F (dynes) = m (grams) · a (cm./sec.²), where 
m is the mass being accelerated and a is the acceleration. In vector 
notation Newton's law of motion is simply stated by F = ma. 

b. When an electron having e charge units is in an electric field 
E, it is accelerated in the direction of the electric field by a force 
F (dynes) = eE, where the units are chosen for e and E so that the 
equality is satisfied. In vector notation the above becomes F = eE. 

c. When a charged particle having e charge units moves with a velocity 
v (cm./sec.) relative to the observer across a magnetic field H, a 
force is exerted perpendicular to the plane defined by the directions 
of H and v. The magnitude F = (Hev/c) sin θ is in dynes, where c is 
the velocity of light in cm./sec., θ is the angle between H and v, and 
H is in gauss, the units being so defined as to satisfy the equality. 
In vector notation the above is summarized by F = (e/c)(v × H). 

2. Equations of Motion 

The general problem to be studied can be discussed in terms of 
results of a simple case. For this purpose let us first study the 
motion of an electron when moving in a region with a uniform electric 
field produced by a parallel plate condenser superimposed on a uniform 
magnetic field between plane, parallel poles. Define reference axes 
in this region such that E has no component in the y direction, and 
H is parallel to the positive z axis. On the basis of the preceding 
remarks regarding forces acting upon an electron which is moving in 
an electric and magnetic field, the equations of motion are: 


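$$m\,\frac{d^2z}{dt^2} = eE_z, \tag{2.1}$$

$$m\,\frac{d^2y}{dt^2} = -\frac{He}{c}\,\frac{dx}{dt}, \tag{2.2}$$

$$m\,\frac{d^2x}{dt^2} = eE_x + \frac{He}{c}\,\frac{dy}{dt}. \tag{2.3}$$

Integration of (2.1) with respect to t yields 

$$z = z_0 + z_0'\,t + \frac{eE_z}{2m}\,t^2, \tag{2.4}$$

where $z = z_0$, $dz/dt = z_0'$ when t = 0. Let $v_x = dx/dt$ and $v_y = dy/dt$. 
Upon differentiation of (2.3) one obtains 

$$m\,\frac{d^2v_x}{dt^2} = \frac{He}{c}\,\frac{dv_y}{dt},$$

and upon substituting into (2.2) it is found that 

$$\frac{d^2v_x}{dt^2} + \left(\frac{eH}{mc}\right)^2 v_x = 0.$$

The solution of the latter is 

$$v_x = A\cos\gamma t + B\sin\gamma t,$$

where $\gamma = eH/mc$, and an integration with respect to t yields 

$$x = x_0 + \frac{A}{\gamma}\sin\gamma t - \frac{B}{\gamma}\cos\gamma t. \tag{2.5}$$
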
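According to (2.3), 

$$\frac{dy}{dt} = \frac{1}{\gamma}\,\frac{d^2x}{dt^2} - \frac{cE_x}{H};$$

therefore upon integrating with respect to t, assuming $y = y_0$ when 
t = 0, one finds 

$$y = y_0 - \frac{cE_x}{H}\,t + \frac{1}{\gamma}\,\frac{dx}{dt},$$

$$y = y_0 - \frac{cE_x}{H}\,t + \frac{A}{\gamma}\cos\gamma t + \frac{B}{\gamma}\sin\gamma t,$$

$$y = y_0 - Ut + \frac{A}{\gamma}\cos\gamma t + \frac{B}{\gamma}\sin\gamma t, \tag{2.6}$$

where $U = cE_x/H$. 

As further initial conditions let $v_x = x_0'$ and $v_y = y_0'$ when t = 0. 
Consequently $A = x_0'$ and $B = y_0' + U$. The equations of motion become 

$$z = z_0 + z_0'\,t + \frac{eE_z}{2m}\,t^2,$$

$$x = x_0 + \frac{x_0'}{\gamma}\sin\gamma t - \frac{y_0' + U}{\gamma}\cos\gamma t,$$

$$y = y_0 + \frac{x_0'}{\gamma}\cos\gamma t + \frac{y_0' + U}{\gamma}\sin\gamma t - Ut.$$

If $\theta_0$ is so defined that 

$$\sin\theta_0 = \frac{x_0'}{[x_0'^{2} + (y_0' + U)^2]^{1/2}} \quad\text{and}\quad \cos\theta_0 = \frac{y_0' + U}{[x_0'^{2} + (y_0' + U)^2]^{1/2}},$$

then the equations of motion can be written in the final form: 

$$z = z_0 + z_0'\,t + \frac{eE_z}{2m}\,t^2, \tag{2.7}$$

$$x = x_0 - \frac{1}{\gamma}\,[x_0'^{2} + (y_0' + U)^2]^{1/2}\cos(\gamma t + \theta_0), \tag{2.8}$$

$$y = y_0 + \frac{1}{\gamma}\,[x_0'^{2} + (y_0' + U)^2]^{1/2}\sin(\gamma t + \theta_0) - Ut, \tag{2.9}$$

where $\gamma = eH/mc$, $U = cE_x/H$, $\theta_0 = \arctan[x_0'/(y_0' + U)]$. 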


3. Parametric Equations of the Cycloid. 


Let us next consider a circle, radius r, rolling along the y axis 
with an angular velocity $\gamma$. The parametric equations for the locus 
of a point at a distance d from the center of the circle are: 

$$y = r\gamma t - d\sin\gamma t,$$

$$x = r - d\cos\gamma t,$$

where y = 0, x = r - d when t = 0. By definition the locus of the 
point is a prolate cycloid if d > r, a cycloid if d = r, and a curtate 
cycloid if d < r. 

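A more general form of these equations can be quickly established 
by considering first a translation of axes. Then 

$$y = y_0 + r\gamma t - d\sin\gamma t,$$

$$x = x_0 - d\cos\gamma t.$$

If t ≠ 0 when y = 0, then a phase angle $\theta_0$ can be defined to satisfy 

$$y = y_0 + r(\gamma t + \theta_0) - d\sin(\gamma t + \theta_0),$$

$$x = x_0 - d\cos(\gamma t + \theta_0).$$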


A comparison of the equations of motion (2.7), (2.8), (2.9) with 
the above equations shows that the electron has two characteristic 
motions: 

(1) a component of motion producing a parabolic path parallel to 
the magnetic field, 

(2) a cycloidal component of motion perpendicular to the magnetic 
field. 

With regard to the cycloidal motion, certain quantities to which the 
physicist refers in his analysis of physical problems are important 
to mention. The definitions are given in terms of the rolling circle 
producing the cycloid, but they can be easily specialized to discuss 
the motion of electrons in uniform crossed electric and magnetic fields. 

Frequency $\nu$. The angular velocity of the rolling circle associated 
with the cycloid is $\gamma$. Frequency, which is defined as the number of 
complete revolutions of the rolling circle per second, is 

$$\nu = \gamma/2\pi.$$

Period T. The period is defined as the time required for a complete 
revolution of the rolling circle. Thus 

$$T = 2\pi/\gamma.$$

Drift velocity U. The drift velocity is defined as the linear 
velocity of the center of the rolling circle and in magnitude is 

$$U = r\gamma.$$


Amplitude d. The amplitude d is the distance from the center of 
the rolling circle to the point on the extended radius whose locus 
produces the cycloid. It is clear from (2.8) that 

$$d = \frac{1}{\gamma}\,[x_0'^{2} + (y_0' + U)^2]^{1/2}.$$

If $x_0'$ and $y_0'$ are expressed in units of U, then d can be written in the 
form 

$$d = r\,[x_0'^{2} + (y_0' + 1)^2]^{1/2}.$$

It at once follows that the locus is a prolate cycloid, cycloid, or 
curtate cycloid according to whether $x_0'^{2} + (y_0' + 1)^2$ is greater than, 
equal to, or less than 1 respectively. 

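Path length per cycle $\Lambda$. With the use of the familiar calculus 
definition of length one easily obtains 

$$ds = [d^2\gamma^2\sin^2(\gamma t + \theta_0) + d^2\gamma^2\cos^2(\gamma t + \theta_0) - 2Ud\gamma\cos(\gamma t + \theta_0) + U^2]^{1/2}\,dt,$$

$$= [d^2\gamma^2 + U^2 - 2Ud\gamma\cos(\gamma t + \theta_0)]^{1/2}\,dt,$$

$$= [d^2\gamma^2 + r^2\gamma^2 - 2dr\gamma^2\cos(\gamma t + \theta_0)]^{1/2}\,dt,$$

$$= \gamma\,[d^2 + r^2 - 2dr\cos(\gamma t + \theta_0)]^{1/2}\,dt.$$

Integration with respect to t between the limits of 0 and T yields 

$$\Lambda = \gamma\int_0^T [d^2 + r^2 - 2dr\cos(\gamma t + \theta_0)]^{1/2}\,dt.$$

For the special case where the initial velocities vanish, yielding 
d = r, a well known result is obtained: 

$$\Lambda = 2^{1/2}\,r\gamma\int_0^T [1 - \cos\gamma t]^{1/2}\,dt,$$

$$= 8r.$$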
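
The path-length integral can be checked numerically. The sketch below is an illustration only: the function name `path_length_per_cycle`, its default arguments, and the choice of the midpoint rule are assumptions made for this example, not part of the article. It evaluates γ∫[d² + r² − 2dr cos(γt + θ₀)]^{1/2} dt over one period T = 2π/γ.

```python
import math


def path_length_per_cycle(d, r, gamma=1.0, theta0=0.0, steps=20000):
    """Midpoint-rule estimate of the path length over one period.

    Integrates gamma * sqrt(d^2 + r^2 - 2*d*r*cos(gamma*t + theta0))
    for t from 0 to T = 2*pi/gamma.
    """
    T = 2.0 * math.pi / gamma
    h = T / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * h
        radicand = d * d + r * r - 2.0 * d * r * math.cos(gamma * t + theta0)
        # Guard against tiny negative values caused by rounding.
        total += math.sqrt(max(0.0, radicand))
    return gamma * total * h


# Ordinary cycloid (d = r): the estimate should be close to 8r.
length_cycloid = path_length_per_cycle(d=1.0, r=1.0)
```

For d ≠ r the same function gives the length of one arch of the prolate or curtate cycloid, for which no comparably simple closed form is quoted in the text.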


By substituting the maximum and minimum values of ±1 for $\cos(\gamma t + \theta_0)$ 
it is possible to establish rough bounds between which $\Lambda$ must lie, 
that is $|d - r|\,\gamma T < \Lambda < |d + r|\,\gamma T$. 

Average squared velocity per cycle, $v^2$/cycle. The use of $v^2 = 
(dx/dt)^2 + (dy/dt)^2$ and the parametric equations (2.8) and (2.9) leads 
to the expression 

$$v^2/\text{cycle} = \frac{2\gamma^2}{T}\int_0^{T/2} [d^2 + r^2 - 2dr\cos(\gamma t + \theta_0)]\,dt,$$

$$= \gamma^2\,[d^2 + r^2 + (4dr/\pi)\sin\theta_0].$$

For the special case where the initial velocities are zero in magnitude, 
the result simplifies to 

$$v^2/\text{cycle} = 2r^2\gamma^2 = 2U^2.$$

4. Applications. 


Uniform crossed electric and magnetic fields have been employed 
in numerous electronic devices. A typical example is provided by the 
magnetic electron multiplier [1]. Electrons leaving a photocathode 
C are accelerated across an electric field towards the plate at a 
potential of P volts. A uniform magnetic field normal to the plane of 
the adjacent drawing turns the electrons in cycloidal paths to the next 
secondary emission target. The electron multiplier seeks to produce a 
large electron flow upon the collector for only a small initial emission 
of electrons from a cathode which emits electrons when light falls upon 
its surface. When the electrons strike the next secondary emission 
plate, they are able to eject electrons from the surface, thereby 
multiplying the number of electrons several times. A similar field 
combination is employed in the television pickup tube called the 
Orthicon [2]. The knowledge of the cycloidal paths and the corresponding 
velocities is necessary in obtaining the maximum performance for the 
equipment. 

In contrast to the previous example, electron multiplication can 
be a most serious engineering problem. Positive ion sources are used 
in electromagnetic separation of isotopes. In order to accelerate 
positive ions a uniform electric field is used in the presence of 
a magnetic field. Stray electrons in these regions are able to multiply 
exponentially as their path lengths increase. Usually the ability 
to ionize increases with path length, velocity, and gas pressure. 
Eventually these electrons strike more positive surfaces, usually the 
ion source mechanism itself, thereby causing serious heating, 
destruction of metal, and a lowering of the positive charge on the unit. 
The problem of the physicist is to construct the positive ion source 
such that the cycloidal path lengths and velocities of the electrons 
are kept at a minimum. 

In order to assist the fuller appreciation of the magnitudes involved 
in the cycloidal paths of the electrons, a typical example of electric 
and magnetic fields used for the electromagnetic separation of isotopes 
is offered. For convenience let $x_0 = y_0 = 0$ and $E_x$ = 3000 volts/cm. = 
10 esu/cm., $E_z$ = 300 volts/cm. = 1 esu/cm., H = 3000 gauss, c = 
3·10^10 cm./sec., e/m = 5.27·10^17 esu/gram. From the results obtained 
in section 3, the following magnitudes for the characteristics of the 
cycloidal motion are derived: 


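$$\nu = eH/2\pi mc = 8.4\cdot 10^{9}\ \text{cycles/sec.},$$

$$T = 2\pi mc/eH = 1.2\cdot 10^{-10}\ \text{sec./cycle},$$

$$U = cE_x/H = 10^{8}\ \text{cm./sec.},$$

$$r = (m/e)(c/H)^2E_x = 1.9\cdot 10^{-3}\ \text{cm.}$$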


For the special case of negligible initial velocities with respect 
to the drift velocity U, one finds in addition that 

$$\Lambda = 1.5\cdot 10^{-2}\ \text{cm.} = .15\ \text{mm.},$$

$$[v^2/\text{cycle}]^{1/2} = 1.4\cdot 10^{8}\ \text{cm./sec.}$$
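
The magnitudes quoted for this example can be recomputed from the definitions ν = γ/2π, T = 2π/γ, U = cE_x/H, r = U/γ, Λ = 8r and v²/cycle = 2U². The sketch below is an illustration only; the variable names are hypothetical, and the input values (in particular e/m = 5.27·10^17 esu/gram, the standard Gaussian-unit value) are assumptions about what the partly illegible example intends.

```python
import math

# Assumed inputs in Gaussian (cgs) units.
e_over_m = 5.27e17   # esu/gram, electron charge-to-mass ratio
H = 3000.0           # gauss
c = 3.0e10           # cm/sec
E_x = 10.0           # esu/cm (3000 volts/cm)

gamma = e_over_m * H / c             # angular velocity eH/mc, rad/sec
nu = gamma / (2.0 * math.pi)         # frequency, cycles/sec
T = 1.0 / nu                         # period, sec/cycle
U = c * E_x / H                      # drift velocity, cm/sec
r = U / gamma                        # radius of the rolling circle, cm

# Special case of negligible initial velocities (d = r).
path_length = 8.0 * r                # path length per cycle, cm
rms_velocity = math.sqrt(2.0) * U    # square root of v^2/cycle, cm/sec

# Potential change across a distance r in the 3000 volts/cm field.
volts_across_r = 3000.0 * r
```

With these inputs the computed values round to the figures given in the text, including the few volts of potential change corresponding to a displacement r along the electric field.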


The magnitude of r suggests that the fluctuation of the electron 
in the direction of the electric field due to the cycloidal motion 
is very small. For the example above $E_x$ = 3000 volts/cm.; thus a change 
of 1.9·10^-3 cm. is equivalent to only 5.7 volts. These results 
lead to the approximate statement of the physicist that the electron 
moves along the equipotentials with a drift velocity U. This latter 
statement of course does not take into account the effects of the 
electric field parallel to the magnetic field or the fact that the 
actual amplitude d is determined by the initial velocities $x_0'$ and $y_0'$. 
It is quite possible in a physical problem to have d much greater 
than r. 

With regard to this latter point a final remark is worthwhile. When 
an electron has an ionizing collision with a gas molecule, or when 
upon striking a metallic surface an electron is ejected, there is an 
energy transfer to the created electrons which gives them velocities 
greater than 10^8 cm./sec. This initial velocity can be in 
any direction relative to the electric or magnetic fields. Thus it is 
not necessarily correct to assume that $x_0'$ and $y_0'$ are negligible with 
respect to U. To illustrate the relative importance of the initial 
velocities a final example is given. It is assumed that $x_0 = y_0 = 
x_0' = 0$. As a consequence $\theta_0 = 0$. Equations (2.8) and (2.9) simplify to 

$$x = -r(y_0' + 1)\cos\gamma t,$$

$$y = r(y_0' + 1)\sin\gamma t - Ut,$$

if $y_0'$ is expressed in units of U. The cycloidal path as a function 
of the magnitude of $y_0'$ is shown in figure 1. 


PROOF OF FERMAT'S LAST THEOREM FOR n = 2(8a + 1) 


Thomas Griselle 


THEOREM: No integral solution exists for 

$$x^{2m} + y^{2m} = z^{2m}, \tag{1}$$

m being a prime integer > 2 and not a divisor of x, y, or z, unless 
m is of the form 8a + 1. 


PROOF: It suffices to assume that (1) has a primitive solution 
(one in which x, y, z are coprime), x being even, y and z being odd. 
We cannot assume that x and y are odd and that z is even, because the 
sum of two odd squares is not divisible by 4, as shown by the identity 

$$(2a + 1)^2 + (2b + 1)^2 = 2[2(a^2 + a + b^2 + b) + 1].$$

Transposing the terms of (1) and factoring, we have 

$$x^{2m} = z^{2m} - y^{2m} = (z^2 - y^2)(z^{2m-2} + z^{2m-4}y^2 + \cdots + y^{2m-2}).$$

It is obvious that the factor $z^2 - y^2$ is even. Having m odd terms, 
the second factor is odd. The G.C.D. of these two factors is m or 1*. 
Assuming that x is not divisible by m, it follows that neither factor 
is divisible by m. The G.C.D. then is 1. Hence, each is an integral 
2mth power. Let 

$$z^{2m-2} + z^{2m-4}y^2 + \cdots + y^{2m-2} = k^{2m}, \tag{2}$$

k being a suitable odd integer. 
Being an odd square, each term of (2) is of the form 8a + 1, as 
shown by the identity 

$$(2a + 1)^2 = 8\cdot\frac{a(a + 1)}{2} + 1.$$

Thus, (2) is of the form 

$$8(a_1 + a_2 + \cdots + a_m) + m = 8A + 1, \tag{2a}$$

which, of course, cannot be satisfied unless m is of the form 8a + 1. 
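
The congruence argument can be illustrated numerically. The sketch below is only an illustration with hypothetical helper names (`odd_square_residues`, `sum_of_odd_squares_mod8`); it checks that every odd square leaves remainder 1 on division by 8, so that a sum of m odd squares leaves the same remainder as m.

```python
def odd_square_residues(limit=1000):
    """Set of residues mod 8 taken by (2a + 1)^2 for 0 <= a < limit."""
    return {(2 * a + 1) ** 2 % 8 for a in range(limit)}


def sum_of_odd_squares_mod8(m):
    """Residue mod 8 of a sum of m odd squares (using 1, 3, 5, ...)."""
    return sum((2 * i + 1) ** 2 for i in range(m)) % 8


# Odd primes m for which a sum of m odd squares can itself be an odd
# square, i.e. can be congruent to 1 mod 8.
small_primes = [3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41]
admissible = [m for m in small_primes if sum_of_odd_squares_mod8(m) == 1]
```

Among the primes listed, only those of the form 8a + 1 (here 17 and 41) survive, in agreement with the condition stated in the theorem.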
Incidentally, this argument proves that no primitive solution exists for 

$$x^{2} + y^{2m} = z^{2m},$$

x being even, y and z being odd, m being a prime integer > 2 and 
not a divisor of x, unless m is of the form 8a + 1. 


*See pages 87-88 of Diophantine Analysis, Carmichael. 


A FORMULA FOR THE CALCULATION OF THE INERTIA MOMENT 
OF SOME GEOMETRICAL SOLIDS 


Christos VN. Kefalas 


The teaching in colleges of the moment of inertia of the various 
solids generally meets with difficulties in view of the necessity 
to use a specific formula in each particular case. The object of this 
study is to provide a standard formula covering several solids. 
Figure (1) represents a solid bounded by a surface, each plane 
section of which, taken parallel to the xy-plane and at a distance z 
from it, is a regular N-gon of radius R = f(z). 


Figure 1 


We will have for each section $\omega = \pi/N$, $a = 2R\sin\omega$, $d = R\cos\omega$. 
Let D = the density. 
The moment of inertia of this body about the z-axis is given by 

$$I = D\iiint (x^2 + y^2)\,dx\,dy\,dz = D\int_0^h dz\iint (x^2 + y^2)\,dx\,dy \tag{1}$$
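and for the sake of symmetry, we will have 

$$T = \iint_{N\text{-gon}} (x^2 + y^2)\,dx\,dy = 2N\iint_{\text{triangle } OAB} (x^2 + y^2)\,dx\,dy.$$

But $y = x\tan\omega$, therefore 

$$T = 2N\int_0^{R\cos\omega} dx\int_0^{x\tan\omega} (x^2 + y^2)\,dy = 2N\int_0^{R\cos\omega} \left(x^3\tan\omega + \tfrac{1}{3}x^3\tan^3\omega\right)dx$$

$$= 2N\,\frac{R^4}{4}\,\sin\omega\cos\omega\left(\cos^2\omega + \tfrac{1}{3}\sin^2\omega\right)$$

and 

$$I = \frac{N}{2}\,D\sin\omega\cos\omega\left(\cos^2\omega + \tfrac{1}{3}\sin^2\omega\right)\int_0^h R^4\,dz. \tag{2}$$
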


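Given 

$$R^n + bz^m = c, \tag{3}$$

where b, c, m, and n are real numbers, we will have 

$$I = \frac{N}{2}\,D\sin\omega\cos\omega\left(\cos^2\omega + \tfrac{1}{3}\sin^2\omega\right)\int_0^h (c - bz^m)^{4/n}\,dz. \tag{4}$$

For c ≠ 0, b ≠ 0 and $|(b/c)h^m| < 1$, according to the Binomial theorem, we 
will have, writing k = 4/n, 

$$I = \frac{N}{2}\,D\sin\omega\cos\omega\left(\cos^2\omega + \tfrac{1}{3}\sin^2\omega\right)c^k\left[h - \binom{k}{1}\frac{b}{c}\,\frac{h^{m+1}}{m+1} + \binom{k}{2}\frac{b^2}{c^2}\,\frac{h^{2m+1}}{2m+1} - \cdots\right]. \tag{5}$$

When n = 1, 2, 4, then k = 4, 2, 1, and the above formula provides 
the inertia moment of some geometrical solids. 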


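It is obvious that when N → ∞, $\frac{N}{2}\sin\omega\cos\omega(\cos^2\omega + \tfrac{1}{3}\sin^2\omega)$ has 
$\pi/2$ for limit; actually, $N\sin\omega\cos\omega = \frac{N}{2}\sin 2\omega = \frac{N}{2}\sin\frac{2\pi}{N}$ and, according 
to L'Hospital's formula, $\frac{N}{2}\sin\frac{2\pi}{N}$ has $\pi$ for limit when N → ∞, 
and $\cos^2\omega + \tfrac{1}{3}\sin^2\omega$ has unity for limit. 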
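
The general formula lends itself to a numerical check. The sketch below is an illustration only: the function names `polygon_factor` and `moment_about_axis`, the midpoint rule, and the sample profiles are assumptions made for this example. It evaluates the factor (N/2) sin ω cos ω (cos²ω + ⅓ sin²ω) and multiplies it by the density and by a numerical estimate of the integral of R⁴ over the height.

```python
import math


def polygon_factor(N):
    """(N/2) sin(w) cos(w) (cos^2(w) + sin^2(w)/3) with w = pi/N."""
    w = math.pi / N
    return 0.5 * N * math.sin(w) * math.cos(w) * (
        math.cos(w) ** 2 + math.sin(w) ** 2 / 3.0
    )


def moment_about_axis(N, R, h, density=1.0, steps=2000):
    """Moment of inertia about the z-axis of a solid whose section at
    height z is a regular N-gon of circumradius R(z), for 0 <= z <= h.

    The integral of R(z)^4 is estimated with the midpoint rule.
    """
    dz = h / steps
    integral = sum(R((i + 0.5) * dz) ** 4 for i in range(steps)) * dz
    return density * polygon_factor(N) * integral


# A very large N approximates the circular sections of a cylinder or cone.
cylinder = moment_about_axis(10**6, lambda z: 1.0, h=2.0)
cone = moment_about_axis(10**6, lambda z: 1.0 - z / 3.0, h=3.0)
```

For N = 4 the factor equals 2/3, which agrees with the polar moment a⁴/6 of a square of side a = R√2; for large N it approaches π/2, the value for a circle.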


Applications 


A) Moment of inertia about a line. 

1) Regular prism and right circular cylinder about their axes. 

In this case, relation (3) becomes R = r, i.e. b = 0, c = r, n = 1 and 
m = 0. 
Hence, for a regular prism we have 

$$I = \frac{DN}{2}\sin\omega\cos\omega\left(\cos^2\omega + \tfrac{1}{3}\sin^2\omega\right)r^4h$$

and for a right circular cylinder 

$$I = \frac{D\pi}{2}\,r^4h.$$



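2) A right pyramid and a right circular cone about their axes. 
In this case, relation (3) becomes 

$$\frac{R}{z} = \frac{r}{h} \quad\text{or}\quad R = \frac{r}{h}\,z.$$

Hence for a right pyramid we have 

$$I = \frac{DN}{2}\sin\omega\cos\omega\left(\cos^2\omega + \tfrac{1}{3}\sin^2\omega\right)\left(\frac{r}{h}\right)^4\frac{h^5}{5}$$

or 

$$I = \frac{DN}{10}\sin\omega\cos\omega\left(\cos^2\omega + \tfrac{1}{3}\sin^2\omega\right)h\,r^4$$

and for a right circular cone 

$$I = \frac{D\pi}{10}\,h\,r^4.$$
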


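3) Frustum of a regular pyramid and frustum of a right circular cone 
about their axes. 
In this case, relation (3) becomes 

$$R = r_1 + \frac{r_2 - r_1}{h}\,z, \quad\text{i.e.}\quad b = -\frac{r_2 - r_1}{h},\ c = r_1,$$

and m = n = 1. 
Hence for a frustum of a regular pyramid 

$$I = \frac{DN}{10}\,h\sin\omega\cos\omega\left(\cos^2\omega + \tfrac{1}{3}\sin^2\omega\right)\frac{r_2^5 - r_1^5}{r_2 - r_1},$$

or 

$$I = \frac{DN}{10}\,h\sin\omega\cos\omega\left(\cos^2\omega + \tfrac{1}{3}\sin^2\omega\right)\left(r_1^4 + r_1^3r_2 + r_1^2r_2^2 + r_1r_2^3 + r_2^4\right);$$

and for a frustum of a right circular cone 

$$I = \frac{D\pi}{10}\,h\,\frac{r_2^5 - r_1^5}{r_2 - r_1}.$$

4) Sphere and hemisphere about their axes. 
In this case, the relation (3) becomes 

$$R^2 + z^2 = r^2, \quad\text{i.e.}\quad b = 1,\ c = r^2 \text{ and } m = n = 2.$$

Hence 

$$I = \frac{\pi}{2}\,D\left[r^4h - \frac{2}{3}\,r^2h^3 + \frac{h^5}{5}\right]$$

and, as for the hemisphere h = r, we have $I = \frac{4}{15}D\pi r^5$, and, for the 
sake of symmetry, for the sphere 

$$I = \frac{8}{15}\,D\pi r^5.$$

5) Ellipsoid of revolution about its axis. 
In this case, relation (3) becomes 

$$\frac{R^2}{A^2} + \frac{z^2}{B^2} = 1 \quad\text{or}\quad R^2 = A^2\left(1 - \frac{z^2}{B^2}\right), \quad\text{i.e.}\quad b = \frac{A^2}{B^2},\ c = A^2 \text{ and } m = n = 2.$$

Hence 

$$I = \frac{\pi}{2}\,DA^4\left[h - \frac{2h^3}{3B^2} + \frac{h^5}{5B^4}\right].$$

But h = B, hence, for the half of the ellipsoid above the xy-plane, 

$$I = \frac{4}{15}\,D\pi A^4B.$$

6) Paraboloid of revolution about its axis. 
In this case relation (3) becomes 

$$R^2 - pz = 0, \quad\text{i.e.}\quad b = -p,\ c = 0,\ n = 2 \text{ and } m = 1.$$

Hence, directly from (4), 

$$I = \frac{\pi}{6}\,D\,p^2h^3.$$

7) Hyperboloid of revolution about its axis. 
In this case, relation (3) becomes 

$$\frac{R^2}{A^2} - \frac{z^2}{B^2} = 1, \quad\text{i.e.}\quad b = -\frac{A^2}{B^2},\ c = A^2 \text{ and } m = n = 2.$$

Hence 

$$I = \frac{\pi}{2}\,DA^4\left[h + \frac{2h^3}{3B^2} + \frac{h^5}{5B^4}\right].$$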


B) A similar calculation should apply respectively to a plane and to 
a point. 


Athens, Greece. 


THE RULE OF DOUBLE FALSE 
Harold E. Bowie 


My students became interested as to why the Rule of Double False 
used by Robert Recorde and other early mathematicians gave them correct 
answers. They found that arithmetic explanations were difficult because 
of their tendency to think about problems in the language of algebra. 
In our explanations, which we developed by methods of the elementary 
algebra of today, we used a problem described by Vera Sanford in her 
book A Short History of Mathematics.* 
“This particular problem and its solution are from Robert Recorde’s 
Ground of Artes (c. 1542). 


One man said to another, I think you had this year two thousand 
Lambes: so had I said to the other; but what with paying the tythe 
of them, and then the several losses they are much abated: for at 
one time I lost half as many as I have now left, and at another time 
the third part of so many, and the third time 1/4 so many. Now guesse 
you how many are left. 


It was clear that after the tithe was deducted, 1800 lambs were 
left. If the man had had 12 at the end, he would have had 12 + 6 + 4 
+ 3 or 25 at the beginning. This is 1775 too few. On the other hand, 
if he had 24 at the end, he would have had 24 + 12 + 8 + 6 or 50 at 
the beginning, which again is 1750 too few. The guesses and errors 
are then written down and the cross-products are found by multiplying 
along the guide lines thus: 


      12        24 
           × 
    1775      1750 
Then the difference of these products is divided by the difference 
of the errors and the quotient is the required number. In this case 
it is 

$$\frac{42600 - 21000}{1775 - 1750} = \frac{21600}{25} = 864.$$

Our explanation follows: 
Let x = the number left. 
Then 

(1) x + x/2 + x/3 + x/4 = 1800 
(2) 12 + 6 + 4 + 3 = 25 
(3) 24 + 12 + 8 + 6 = 50 

Subtracting (2) from (1) we have 

(4) (x - 12)(1 + 1/2 + 1/3 + 1/4) = 1775 

Subtracting (3) from (1) we have 

(5) (x - 24)(1 + 1/2 + 1/3 + 1/4) = 1750 

Dividing (4) by (5) we obtain (as x ≠ 24) 

$$\frac{x - 12}{x - 24} = \frac{1775}{1750}$$

1750x - 12·1750 = 1775x - 24·1775 

x(1775 - 1750) = 24·1775 - 12·1750 

$$x = \frac{24\cdot 1775 - 12\cdot 1750}{1775 - 1750}$$

which justifies the Rule of Double False, as the method of explanation 
is evidently perfectly general. 
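
The rule can be written out as a short Python sketch. This is an illustration only; the function names `double_false` and `flock_at_start` are hypothetical, and exact fractions are used so that no rounding enters. The two guesses and their errors are combined by cross-multiplication exactly as in the explanation above.

```python
from fractions import Fraction


def double_false(f, target, guess1, guess2):
    """Rule of Double False for a problem f(x) = target.

    err1 and err2 are the amounts by which the guesses fall short.
    The cross-products guess2*err1 and guess1*err2 are subtracted and
    divided by the difference of the errors. The result is exact when
    f(x) is of the form a*x + b.
    """
    err1 = target - f(guess1)
    err2 = target - f(guess2)
    if err1 == err2:
        raise ValueError("the two guesses give the same error")
    return (guess2 * err1 - guess1 * err2) / (err1 - err2)


def flock_at_start(x):
    """Recorde's problem: the number left plus the three losses."""
    return x + x / 2 + x / 3 + x / 4


lambs_left = double_false(flock_at_start, 1800, Fraction(12), Fraction(24))
```

Any two distinct guesses give the same answer here, since the flock at the start is proportional to the number left.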

In the same volume by Miss Sanford we find a problem illustrating 
the Rule of False Position. 


“To solve the equation x + (1/7)x = 19, the unknown number x is assumed 
to be 7. Then the sum of the number and its seventh part will be 8, 
and the number of the equation is the same multiple of 7 that 19 is 
of the guessed number 8.” 


We give an explanation of this simpler rule for the general case. 

(1) x + (1/7)x = 19 

Let x = K be our guess. 
Then 

(2) K + (1/7)K = C 

From (1) 

x(1 + 1/7) = 19 

From (2) 

K(1 + 1/7) = C 

Dividing, with K ≠ 0, we obtain 

$$\frac{x}{K} = \frac{19}{C}.$$


*See A Short History of Mathematics by Vera Sanford, Boston, (1930), 
page 160. 


American International College 


THEOREM: Every integer greater than 2 is a member of a Pythagorean 
triplet; every integer of the form (2n + 1) is a member of a primitive 
Pythagorean triplet. 

(Note: For convenience, when the word “integer” is used in the 
following, it refers to an integer greater than 2.) 

The integers a, b, c form a Pythagorean triplet when they satisfy 
the equation: 

$$a^2 + b^2 = c^2.$$

When they have no common divisor, the triplet is primitive. 
The equation 

$$a^2 + 2ab + b^2 = (a + b)^2$$

can be reduced to a Pythagorean triplet whenever $(2ab + b^2)$ is a 
rational square. Members of such a triplet are $a$, $(2ab + b^2)^{1/2}$, $(a + b)$. 

Every integer is of the form (2n + 1) or (2n + 2) and has a square 
of similar form, with the square of (2n + 2) being also of the form 
(4n + 4). As a consequence, the square of any integer is of the form 
$(2ab + b^2)$ and the integer is, therefore, a member of a Pythagorean 
triplet, consisting of: 

$\tfrac{1}{2}(b^2 - 1)$, $b$, $\tfrac{1}{2}(b^2 - 1) + 1$, where b is of form (2n + 1) 

$\tfrac{1}{4}(b^2 - 4)$, $b$, $\tfrac{1}{4}(b^2 - 4) + 2$, where b is of form (2n + 2) 

If b is of the form (2n + 1), the triplet can be expressed in terms 
of a as follows: $a$, $(2a + 1)^{1/2}$, $(a + 1)$. As a, (2a + 1), and (a + 1) are 
prime to one another, $(2a + 1)^{1/2}$ is prime to a and to (a + 1). 
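
The construction can be sketched in Python. This is an illustration only, with a hypothetical function name `triplet_containing`; it returns the triplet built from an integer b greater than 2 by the two formulas of the note.

```python
from math import gcd


def triplet_containing(b):
    """Return a Pythagorean triplet (a, b, c) having b as a member.

    For odd b:  a = (b^2 - 1)/2, c = a + 1.
    For even b: a = (b^2 - 4)/4, c = a + 2.
    """
    if b <= 2:
        raise ValueError("b must be an integer greater than 2")
    if b % 2 == 1:
        a = (b * b - 1) // 2
        return (a, b, a + 1)
    a = (b * b - 4) // 4
    return (a, b, a + 2)


triplets = {b: triplet_containing(b) for b in range(3, 50)}
```

For odd b the two larger members differ by 1, so the triplet is necessarily primitive; for even b the construction may give a non-primitive triplet such as 8, 6, 10.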


James E. Foster 


MATHEMATICS CONTEST 


Sponsored by the Euclid Circle of the College of Saint Rose. 


Letters explaining the purpose of the contest (that is, to engender a 
keener interest in mathematics), registration blanks, and rules for the 
contest were sent to approximately 65 schools in the capital area near 
Albany, N.Y. They were sent to the principal with an attention line 
addressed to the chairman of the mathematics department. The registration 
blank was sent back within a month to give us an idea of the number of 
prospective entries. 

According to the rules, any original geometric design would be 
accepted. 9" x 12" paper was the largest; color, pen, pencil, paint were 
left to the students' discretion. 

Prizes were: 1st $10; 2nd $5; and the judges named 10 honorable 
mentions. These were given cards naming their distinction, contest 
name and date, and their name. 

Our response was approximately 180 entries from 16 schools. 

The judges were: Chairman of the Board of Regents’ Department of 
Mathematics of N.Y. State; Chairman of the Mathematics Department at 
the N.Y. State College of Teachers in Albany; Professor of Art at the 
College of Saint Rose. 

The first prize winning entry was in perspective and showed a complete 
range of mathematics. That is, the many “figures”, the circle, point, 
straight line, curve, triangle, sphere, pyramid, cube, etc., were 
blended into the design. It was shaded in black, gray, and white. 

Second prize appeared to be rather modern. There was a pillar-like 
design to one side and beside it a sphere with a pyramid cut in. This 
was in black, red, and white. 

These two designs were rather intricate as well as neat. Others 
ranged from a curtain design pattern to perspective modern drawings. 


Reported by Charlene Lysick, Albany, N.Y. 


CURRENT PAPERS AND BOOKS 


Edited by 


This department will present comments on papers previously published in the 
MATHEMATICS MAGAZINE, lists of new books, and book reviews. 

In order that errors may be corrected, results extended, and interesting 
aspects further illuminated, comments on published papers in all departments 
are invited. 

Communications intended for this department should be sent in duplicate 
to H. V. Craig, Department of Applied Mathematics, University of Texas, Austin 
12, Texas. 


Mr. B. E. Mitchell 
Alabama Polytechnic Institute 


Dear Sir: I am interested in Mathematics as a recreation, and I came 
across your article in the Jan.-Feb. number of “Mathematics Magazine.” 

Pharmacists are frequently compelled to solve problems of the type 
you present in filling the car radiator. 

An old pharmacist showed me a method which, while largely “mechanical” 
in nature, expedites and simplifies these problems. I have since found 
the method described in some books on pharmacy under the title 
“ALLEGATION.” 

The problem you present would be solved by “ALLEGATION” in this 
manner. 


    90%             24 parts 
            42% 
    18%             48 parts 


The higher percent is placed above the lower. The desired percent 
is placed in the middle. Subtractions are made diagonally, as indicated 
by the arrows: 42 - 18 = 24 and 90 - 42 = 48. 
The final product to obtain a mixture 42% alcohol is thus 

    24 parts 90% 
    48 parts 18% 

or a ratio of 24:48 = 1:2. 
Since the total is 3 parts = 21 qts., 1/3 must be 90% = 7 qts. (the 
amount that must be drained), and 2/3 must be 18% = 14 qts. 
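
The cross-subtraction of the method can be sketched in Python. This is an illustration only; the function names `alligation` and `quantities` are hypothetical and do not come from the letter or from any pharmacy text.

```python
from fractions import Fraction


def alligation(high, low, target):
    """Parts of the stronger and weaker solutions giving `target` percent.

    The parts of the stronger solution equal target - low, and the parts
    of the weaker solution equal high - target (the diagonal differences).
    """
    if not low < target < high:
        raise ValueError("target must lie strictly between low and high")
    return (target - low, high - target)


def quantities(high, low, target, total):
    """Split a total volume according to the alligation parts."""
    parts_high, parts_low = alligation(high, low, target)
    whole = parts_high + parts_low
    return (Fraction(total * parts_high, whole),
            Fraction(total * parts_low, whole))


parts = alligation(90, 18, 42)             # parts of 90% and of 18%
qts_90, qts_18 = quantities(90, 18, 42, 21)  # quarts of each in 21 qts.
```

The mixture can be checked directly: 7 qts. at 90% and 14 qts. at 18% contain 6.30 + 2.52 = 8.82 qts. of alcohol, which is 42% of 21 qts.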


Extending the application of the principle: 
Example: In what proportion must 8%, 10%, 16%, 18% solutions be 
mixed to give a 14% solution? 
Procedure: 

(1) Write the percentages in diminishing order in the left 
column. 

(2) Write the final (desired) percent in the middle. 

(3) Connect each couplet consisting of a higher and a lower 
percent and perform the subtraction as in the previous 
problem. 

(Note: The multiplicity of original solutions permits several 
procedures, each yielding the same final result. Below are 
illustrations of two possible procedures.) 


    18  (paired with 10)        4 parts of 18 % 
    16  (paired with 8)         6 parts of 16 % 
                  14 % sol. 
    10  (paired with 18)        4 parts of 10 % 
     8  (paired with 16)        2 parts of 8 % 


Another way in which this may be set down is as below: 


    18  (paired with 8)         6 parts of 18 % 
    16  (paired with 10)        4 parts of 16 % 
                  14 % sol. 
    10  (paired with 16)        2 parts of 10 % 
     8  (paired with 18)        4 parts of 8 % 


In regard to your description of the “French Method of Long Division” 
I am unable to follow you. Will you favor me with a more detailed 
description? I “bog down” at the following indicated points: 


              107 
        236 / 25455 
              18 56 
                204 


    You say:                          Questions: 

    “1 x 6 is 6 and 8 makes 14 ...    Why add 8? 
                                      Just where does the 14 come in? 
    Write 8 and carry one ...”        Write 8 where? 
                                      Carry the 1 where? 


(From here on it just seems you grab figures out of thin air and I am 
completely lost in a Minoan Mathematical Maze.) 

I would deeply appreciate clarification. You will forgive my native 
ineptitude for figures. 
Thanking you for any courtesies extended to me, I am, 


Sincerely yours, 
Arithmetic vs Algebra 


The Mitchell article on page 153 of the Jan.-Feb. number speaks 
of the subtraction of 1000 d = 123.45 from 10 000 d = 12345.45 and 
subsequent division of the resulting equation by 9000 as an “ arithmetic 
method”, in contrast apparently to an “ algebraic method”. 

This raises the question: how does the arithmetic method of attacking 
a problem differ from that of algebra? Is it not essentially as follows: 

1. Arithmetic takes the given numbers and operates with them, ob- 
taining interpretable sums, products, etc., until the desired number 


is reached. 

2. Algebra states the relations between the numbers with which the 
problem is concerned (supplying letters for numbers not given) and 
operates on the resulting equations in such a way that some equation 
is found to give the desired unknown explicitly. 

May we illustrate the two methods by this simple problem: the sum 
of two numbers is 18, and 2 times one of them is the other. 

1. By arithmetic. The 2 times and the one make 3. So 18 is 3 times 
the one. Then 18 ÷ 3 = 6, and 2 x 6 = 12. So the numbers are 6 and 12. 

2. By algebra. The problem says that 18 = a + b (a and b are the 
two numbers), and that a = 2b. By substitution, 18 = 2b + b, so 3b = 18, 
b = 6, and a = 12. 

We must remark that those who teach algebra to beginners usually 
obscure the strictly algebraic method by doing such a problem partly 
by arithmetic, saying at once that 2b + b = 18, which is a relation 
not explicitly given in the wording. Such obscuration commonly results 
from the (regrettable!) attempt to “solve by one unknown” a problem 
in which there is more than one unknown number mentioned or implied 
in the problem. 


William R. Ransom, Tufts College 


Analytic Geometry and Calculus. Frederick H. Miller. xii plus 
658 pp. $5.00. John Wiley & Sons, Inc. New York, 1949. 


This text is suitable for a two or three semester course. It 
correlates the subjects of plane and solid analytic geometry, differential 
calculus, and integral calculus, and presents the fundamentals of 
calculus early enough to be of use in other subjects; this early 
presentation is of particular advantage in engineering and other science 
courses. 

Among the outstanding features of the book the reviewer notes the 
more than usually comprehensive study of maxima and minima; the careful 
motivation of the study of conic sections with emphasis on their 
definition in terms of eccentricity; detailed consideration of graphs of 
equations in polar coordinates; special treatment of the limit of 
(sin x)/x as x approaches zero and the limit which serves as the Napierian 
base; a treatment of integration in accord with the analytical method 
of more advanced analysis; an appendix including useful symbols and 
tables, essential definitions and facts of more elementary mathematics; 
3025 exercises together with answers to the odd-numbered exercises; 
chapter summaries useful both to the teacher as an outline for his 
course and to the student as a basis for review. 

The author presents specific topics much as he has presented them 
in his other textbooks. The treatment is sufficiently rigorous to 
serve the student whose primary interest is in pure mathematics and
at the same time it meets the need of the student who wishes to use
mathematics in science or engineering. The teacher will welcome the
clear exposition and will find among the numerous problems many suit- 
able for the average student and others which should stimulate the 
interest and effort of the abler members of his class. 


Helen G. Russell


Vorlesungen über Differential- und Integralrechnung, Zweiter Band.
By A. Ostrowski, Birkhäuser, Basel, 1951, 482 pp. 57 Swiss fr.


This volume is a rigorous, detailed, and lucid treatment of important 
topics in the field of the differential calculus of several variables. 
Among the topics treated we find: infinite sets, functions on sets, 
infinite sequences and series, differentiation of functions of several 
variables, implicit function theory, numerical approximation methods, 
vector algebra and differential calculus (in Gibbs's notation) and differential
geometry of curves and surfaces. This readable and elegant
work should be a valuable reference volume for calculus instructors. 


Homer V. Craig 
PROBLEMS AND QUESTIONS


Edited by 


C. W. Trigg, Los Angeles City College 


Readers of this department are invited to submit for solution problems 
believed to be new and subject-matter questions that may arise in study, in 
research, or in extra-academic situations. Proposals should be accompanied 
by solutions, when available, and by such information as will assist the 
editor. Ordinarily, problems in well-known textbooks should not be submitted.
Solutions should be submitted on separate, signed sheets. Figures should 
be drawn in India ink and twice the size desired for reproduction. Readers 
are invited to offer heuristic discussions in addition to formal solutions. 
Send all communications for this department to R.E. Horton, Los Angeles 
City College, 855 N. Vermont Ave., Los Angeles 29, California. 


PROPOSALS 


168. Proposed by Frank C. Gentry, University of New Mexico.

If P is any point in the plane of a triangle ABC and if P', P'' and P'''
are the harmonic conjugates of P with respect to ABC, then the perpendiculars
let fall from P', P'' and P''' on BC, CA and AB respectively
are concurrent when P is at the circumcenter O, the centroid G, or the
symmedian point K.


169. Proposed by Norman Anning, University of Michigan. 


One student tries to adjust k to make 4x + y = 24 a tangent to
$y = kx^2$ and gets one answer. Another tries to make the same line a
tangent to $x^2 = ky$ and gets two answers. Who is right and what is wrong?


170. Proposed by R. E. Horton, Lackland Air Force Base, Texas. 


A fighter plane flies under the following conditions: 

Fuel Consumption: 100 gallons per hour. 

Fuel Capacity: Main tank holds 250 gallons. 

Two auxiliary wing tanks hold 100 gallons each. 

Cruising air speed: Initial air speed is 400 miles per hour. 

Air speed increases .125 mph per gallon of fuel consumed. 
Air speed increases 5% when wing tanks are dropped simultaneously. 

1) What is the effective range of the plane if a 20% reserve of fuel 
must be kept and the wing tanks are dropped simultaneously when both 
are empty? 

2) What is the effective radius of action North with a wind of 30 mph 
from the South, assuming a fuel reserve of 20% and wing tanks dropped 
simultaneously when both are empty? (Radius of action is the distance 
a plane can fly and still return to its base). 

3) Under what wind conditions will the time out exactly equal the time
back on a radius of action problem for this plane?
171. Proposed by N. A. Court, University of Oklahoma.


The four spheres having for great circles the polar circles of an
orthocentric tetrahedron (T) have for orthogonal center the orthocenter
of the medial tetrahedron of (T).


172. Proposed by H. E. Fettis, Wright-Patterson Air Force Base, Dayton,
Ohio.


Evaluate: $\lim\,[\pi^2 \csc^2 \pi a - 1/a^2]$.


173. Proposed by F. J. Duarte, Caracas, Venezuela. 


Prove that (1) the equation $x^3 - 6abx - 3ab(a + b) = 0$ has no solution
in integers; (2) the equation $9x^3 - 6abx - 3ab(a + b) = 0$ has an infinite
number of integer solutions.


174. Proposed by J. E. Foster, Evanston, Illinois.


The equation $2^p + 1 = 3p'$, where p and p' are odd primes, has
been empirically confirmed for values of p through 17, as follows:


p      3     5     7    11     13     17
p'     3    11    43   683   2731  43691


Are there solutions of the equation for p > 17?
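
The tabulated values can be recomputed directly. The following is a minimal Python sketch, not part of the proposal; the helper names `is_prime` and `companion` are illustrative assumptions. It evaluates (2^p + 1)/3 for the listed values of p and tests each quotient for primality by trial division. It leaves the question for p > 17 open.

```python
def is_prime(n: int) -> bool:
    """Trial division; adequate for the small values tabulated here."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def companion(p: int) -> int:
    """Return (2**p + 1) / 3, which is an integer whenever p is odd."""
    value, remainder = divmod(2**p + 1, 3)
    assert remainder == 0, "2**p + 1 is divisible by 3 only for odd p"
    return value


# Recompute the row of p' values for the tabulated primes p.
table = {p: companion(p) for p in (3, 5, 7, 11, 13, 17)}
for p, q in table.items():
    print(p, q, is_prime(q))
```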
SOLUTIONS 


Late Solutions 


139. M. S. Klamkin, Polytechnic Institute of Brooklyn, N. Y. 
145. Robert Bonic, University of Chicago, Illinois.


120° Triangles with Integer Sides 
147. [November 1952] Proposed by Leon Bankoff, Los Angeles, California. 


In a triangle with integer sides, the side opposite the 120° angle 
is 1729. Find all possible values of the pair of other sides. 

Solution by E. P. Starke, Rutgers University. By the law of cosines
the side opposite the 120° angle is $(a^2 + ab + b^2)^{1/2} = 1729$, where
a, b are the other sides. We have

$$a^2 + ab + b^2 = 1729^2 = 7^2 \cdot 13^2 \cdot 19^2.$$

Now numbers of the form $x^2 + xy + y^2$ are multiplicative, i.e. the
product of two is a third of the same form. The relation is

$$(a^2 + ab + b^2)(c^2 + cd + d^2) = r^2 + rs + s^2,$$

$$r = ac + bd + ad, \quad s = bc - ad, \qquad \text{(A)}$$


and a second choice of r, s which results from interchanging a, b
and/or interchanging c, d, but so that $bc - ad \ge 0$.

For convenience write $a^2 + ab + b^2 = (a, b)$. Then 7 = (2, 1),
13 = (3, 1), 19 = (3, 2). Using (A) we get in succession:


$7^2$ = (7, 0) = (5, 3), $13^2$ = (13, 0) = (8, 7), $19^2$ = (19, 0) = (16, 5),


$7^2 \cdot 13^2$ = (91, 0) = (56, 49) = (65, 39) = (85, 11) = (80, 19), whence
$7^2 \cdot 13^2 \cdot 19^2$ = (1729, 0) = (1064, 931) = (1235, 741) = (1615, 209)
= (1520, 361) = (1456, 455) = (1309, 651) = (1421, 504) = (1144, 845)
= (1560, 299) = (1305, 656) = (1591, 249) = (1679, 96) = (1185, 799).


Conversely it can be shown that if a product $M \cdot N = x^2 + xy + y^2$,
then there exist non-negative integers a, b, c, d such that $M = a^2 + ab + b^2$,
$N = c^2 + cd + d^2$, $x = ac + bd + ad$, $y = bc - ad \ge 0$. Thus the thirteen
triangles found above, omitting (1729, 0), are the only ones possible.
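
The count of thirteen can be confirmed by direct search. The sketch below is a brute-force check and not the multiplicative method of the solution; the function name `sides_120` is an illustrative assumption. For each candidate b it solves $a^2 + ab + b^2 = c^2$ for a and keeps the integer solutions with a > b > 0.

```python
from math import isqrt


def sides_120(c: int) -> list[tuple[int, int]]:
    """All integer pairs (a, b), a > b > 0, with a^2 + a*b + b^2 = c^2."""
    pairs = []
    for b in range(1, c):
        # From the quadratic in a: a = (-b + sqrt(4c^2 - 3b^2)) / 2.
        disc = 4 * c * c - 3 * b * b
        root = isqrt(disc)
        if root * root == disc and (root - b) % 2 == 0:
            a = (root - b) // 2
            if a > b:
                pairs.append((a, b))
    return sorted(pairs, reverse=True)


triangles = sides_120(1729)
for a, b in triangles:
    print(a, b)
print(len(triangles), "triangles")
```

Each printed pair gives the two sides enclosing the 120° angle.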


Also solved by Ward Bouwsma, Calvin College, Grand Rapids, Michigan; 
Sam Kravitz, East Cleveland, Ohio; and the proposer. Partially solved 
by G. G. Becknell, University of Tampa, Florida; M. S. Klamkin, Polytechnic
Institute of Brooklyn; and Prasert Na Nagara, College of
Agriculture, Thailand.

For other discussions dealing with triangles having integer sides 
and one angle of 120°, see L’Arith. de S. Stevin, par A. Girard, Leide, 
(1625), 676; Les Oeuvres Math. de S. Stevin, par A. Girard, (1634),
169; H. Böttcher, Unterrichtsblätter für Math. u. Naturwiss., 19,
132-3, (1913); American Mathematical Monthly, 44, 113, (1937); 48, 
707, (1941). 


A “Lewis Carroll” Pillow Problem 


148. [Nov. 1952] Proposed by D. L. MacKay, Manchester Depot, Vt.

Upon the sides of triangle ABC the squares ABDE, BCFG, ACHL are
constructed exterior to the triangle. Construct triangle ABC given
the points A', B', C' which are the intersections of DE and HL, ED and FG,
GF and LH, respectively.


I. Solution by Prasert Na Nagara, College of Agriculture, Thailand.
Produce A'B' to S making B'S = A'B' and A'C' to U making C'U = A'C'.
Construct the circle (T) through S, B', C' and the circle (V) through
B', C', U. To (T) and (V) at B' and C' respectively, construct tangents
which will intersect at O. Let P, Q, R be the feet of the perpendiculars
from O to B'C', C'A', A'B'. Construct d, the fourth proportional to
(OP + B'C'), B'C', and OP. At distance d from B'C' draw a parallel
meeting B'O and C'O in B and C. Through C draw a parallel to A'C' meeting
OA' in A.

Proof. Let the feet of the perpendiculars from A, B to A'B' be
E, D; of those from B, C to B'C' be G, F; and of those from C, A to
C'A' be H, L.


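$$\frac{OP}{OR} = \frac{\sin OB'P}{\sin OB'R} = \frac{\sin \tfrac{1}{2} C'TB'}{\sin \tfrac{1}{2} STB'} = \frac{a'}{c'}.$$

Similarly, OP/OQ = a'/b'.
Now d = a'(OP)/(OP + a'), so

$$\frac{BC}{a'} = 1 - \frac{d}{OP} = \frac{OP}{OP + a'} = \frac{d}{a'}.$$

Hence BC = d.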
a'/(OP + a') = 1/(1 + OP/a') = 1/(1 + OR/c') = c'/(OR + c').
BD/OR = B'B/B'O = d/OP = c'/(OR + c').
AB/c' = OB/OB' = 1 - d/OP = 1 - c'/(OR + c') = OR/(OR + c').
Hence AB = c'(OR)/(OR + c') = BD.
OC/OC' = BC/a' = OB/OB' = AB/c' = OA/OA', so CA is parallel to b'.
It follows that AC = AL. 


II. Solution by John Jones, Jr., University of North Carolina. 
The triangles ABC and A'B'C' are homothetic, so AA', BB', CC' intersect
in the homothetic center Q. Now the distance from A to A'C' is equal
to b and the distance from A to A'B' is equal to c. Hence a point on
AA' may be constructed by drawing a line parallel to A'B' at a distance
A'B' from it, and a line parallel to A'C' at a distance A'C' from it.
Thus AA', and in like manner, BB' may be determined. So Q, the intersection
of the symmedians AA' and BB', is the symmedian point of the
triangles. Inscribe a square in triangle A'QB' with one side lying
on A'B'. The other two corners of the square will determine vertices
A and B, on A'Q and QB', respectively. Through A and B draw parallels
to A'C' and B'C', respectively. These parallels will intersect in C.


III. Solution by Ward Bouwsma, Calvin College, Grand Rapids,
Michigan. Upon the sides of triangle A'B'C' construct squares A'B'D'E',
B'C'F'G', A'C'H'L'. Then the intersections of D'E' and H'L', E'D' and
F'G', G'F' and L'H' determine vertices A'', B'', C'' of a triangle similar
to A'B'C' and hence to ABC. From A' lay off on A'B' the segment A'E, the
fourth proportional to A''B'', A''E', and A'B'. At E, draw a perpendicular
into triangle A'B'C'. On this perpendicular, from E lay off EA, the
fourth proportional to A''B'', E'A' and A'B'. In like manner, B and C
can be located.


IV. Solution by Leon Bankoff, Los Angeles, California. Draw
triangle A''B''C'' as in III. Draw the symmedians A''A', B''B', C''C' which
intersect in Q. Connect Q to D', E', L', H', F', G'. These connectors
cut the sides of triangle A'B'C' in D, E, L, H, F, G respectively.
Perpendiculars erected at these points to the sides of A'B'C' intersect,
by twos on the symmedians, in A, B, C. The validity of this construction
follows from well-known properties of homothetic figures.


The problem itself has always aroused interest. Five solvers successfully
attacked it for the right triangle in Question 1615 of the
Ladies Diary of 1838. Charles L. Dodgson gives a geometric and a
trigonometric solution as Problem 57 in Pillow Problems, 3rd Edition,
Macmillan, London, 1894. Most modern geometries deal with it or its
essentials, for example, Altshiller-Court, College Geometry, Johnson
(1925), page 232.

Also solved by Prasert Na Nagara, College of Agriculture, Thailand 
(a second solution); Charles Salkind, Polytechnic Institute of Brooklyn; 
A. Sisk, Maryville College; and the proposer who remarked that the
problem is based on the theorem regarding the concurrency of AA',
BB', CC' given by E. W. Grebe in 1847. It is the basis of the German
claim that the symmedian point should be named Grebe's point.


A Gambling Game 


149. [November 1952] Proposed by L. C. May, John Muir College, Pasadena,
California.


A has a gambling device so arranged that he always wins 3 times out 
of every sequence of 5 plays. B suspects that the game is “fixed” and
demands that A always wager half of his resources, which B will match 
with an equal amount. B’s funds are assumed to be unlimited. 

(1) Show that A always loses over a sequence, regardless of the order 
of his 3 winning and 2 losing plays. 

(2) Determine the best fixed percentage of his resources for A to 
wager if he wins 3 out of every 5 plays. 

(3) If A must always wager 1/2 of his resources, determine to the 
nearest integer the number of times he must win out of a sequence of 
one hundred plays in order to break even. 


Solution by H. M. Gehman, University of Buffalo, New York. 

(1) If A always wagers half his resources, when he wins his resources 
are multiplied by 3/2 and when he loses they are multiplied by 1/2. 
Thus a sequence of 3 winning and 2 losing plays in any order multiplies 
A's resources by $(3/2)^3 (1/2)^2 = 27/32$, which is to his disadvantage.

(2) If A always wagers x of his resources, when he wins his resources 
are multiplied by (1 + x) and when he loses they are multiplied by
(1 - x). Thus a sequence of 3 winning and 2 losing plays multiplies
A's resources by $(1 + x)^3 (1 - x)^2$. This function has a maximum when
x = 1/5. Hence if A always wagers 20% of his resources, each sequence 
of 5 plays multiplies his resources by 3456/3125 = 1.10592. This is 
A’s most advantageous choice of the percent he should wager. 


(3) If A wagers half of his resources and wins t times in 100 plays,
in order to break even t must satisfy the equation $(3/2)^t (1/2)^{100 - t} = 1$,
or $3^t = 2^{100}$. The nearest integral root of this equation is 63. But if A
wins only 63 times his resources will be decreased; if he wins 64
times they will be increased.
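
The three parts can be checked with exact rational arithmetic. This is a minimal sketch under stated assumptions: the function name `growth` and the grid search over stakes k/1000 are illustrative choices, not the solver's method, which locates the maximum analytically.

```python
from fractions import Fraction


def growth(x, wins=3, losses=2):
    """Factor applied to A's resources over one sequence when he stakes fraction x."""
    return (1 + x) ** wins * (1 - x) ** losses


# Part (1): staking one half of his resources on every play.
half = growth(Fraction(1, 2))

# Part (2): scan stakes 0, 1/1000, ..., 1 for the largest growth factor.
best = max((Fraction(k, 1000) for k in range(1001)), key=growth)

# Part (3): smallest number of wins t in 100 plays with 3**t > 2**100.
break_even = next(t for t in range(101) if 3**t > 2**100)

print(half, best, growth(best), break_even)
```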


Also solved by Robert Bonic, University of Chicago; J. M. Howell, 
Los Angeles City College; Sam Kravitz, East Cleveland, Ohio; Prasert 
Na Nagara, College of Agriculture, Thailand; and the proposer. 




A Parallelogram Associated with the Triangle 


150. [November 1952] Proposed by P. D. Thomas, U. S. Coast and Geodetic 
Survey. 

The lines joining the vertices of a triangle to the internal points 
of contact of the escribed circles meet in a point. The perpendiculars 
upon the sides of the triangle from the excenters meet in a point. 
Show that these two points together with the orthocenter and the in- 
center of the triangle are the vertices of a parallelogram. 


Solution by Leon Bankoff, Los Angeles, California. Using the conventional
notation, H for orthocenter, I for incenter and O for circumcenter,
let N be the Nagel point (the intersection of the lines joining
the vertices to the internal points of contact of the escribed circles),
and call the intersection of the perpendiculars upon the sides of the
triangle from the excenters, P. Denote the midpoint of HN by U, the
centroid by M, the center of the nine-point circle by F, and the center
of the Spieker circle by S.

IO is parallel to HN, and IO = HN/2 (Johnson, Modern Geometry, 
(1929), page 226). Now O is the midpoint of IP (Altshiller-Court, 
College Geometry, (1950), page 105). Hence HN = IP, and HIPN is a 
parallelogram. 

Furthermore, S is the midpoint of IN (Johnson, page 226), so IN, HP 
and OU bisect each other in S. Also, HO, the Euler line, is bisected by 
F, which is also the midpoint of IU. Now M trisects HO and hence OF 
and IN, so M is the centroid of IOU. Again, HN is a diameter of the
Fuhrmann circle (Johnson, page 228), so U is the circumcenter of the 
Fuhrmann triangle. 
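
A numerical check of the parallelogram property for one sample triangle is sketched below. It is only a sanity check, not a proof. The coordinates and the helper names `combine`, `perpendicular_through` and `meet` are assumptions chosen for illustration. The points are built from their definitions (barycentric weights for I, N and the excenters, intersecting perpendiculars for H and P), and the diagonals HP and IN are tested for a common midpoint.

```python
from math import dist, isclose


def combine(points, weights):
    """Point with the given (unnormalised) barycentric weights."""
    total = sum(weights)
    return (sum(w * p[0] for p, w in zip(points, weights)) / total,
            sum(w * p[1] for p, w in zip(points, weights)) / total)


def perpendicular_through(point, u, v):
    """Line through `point` perpendicular to segment uv, as (point, direction)."""
    return point, (-(v[1] - u[1]), v[0] - u[0])


def meet(line1, line2):
    """Intersection of two non-parallel lines given as (point, direction)."""
    (p, d), (q, e) = line1, line2
    det = d[1] * e[0] - d[0] * e[1]
    t = (e[0] * (q[1] - p[1]) - e[1] * (q[0] - p[0])) / det
    return (p[0] + t * d[0], p[1] + t * d[1])


A, B, C = (0.0, 0.0), (7.0, 0.0), (2.0, 5.0)   # sample triangle (assumed)
tri = (A, B, C)
a, b, c = dist(B, C), dist(C, A), dist(A, B)
s = (a + b + c) / 2

I = combine(tri, (a, b, c))                    # incenter
N = combine(tri, (s - a, s - b, s - c))        # Nagel point
H = meet(perpendicular_through(A, B, C),       # orthocenter
         perpendicular_through(B, C, A))
Ja = combine(tri, (-a, b, c))                  # excenter opposite A
Jb = combine(tri, (a, -b, c))                  # excenter opposite B
P = meet(perpendicular_through(Ja, B, C),      # perpendiculars from excenters
         perpendicular_through(Jb, C, A))

mid_HP = ((H[0] + P[0]) / 2, (H[1] + P[1]) / 2)
mid_IN = ((I[0] + N[0]) / 2, (I[1] + N[1]) / 2)
print(mid_HP, mid_IN)
```

Equal midpoints mean the diagonals HP and IN bisect each other, which is the parallelogram condition for HIPN.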


Also solved by the proposer. 


284 MATHEMATICS MAGAZINE (May-June 


The General Term of a Sequence 


151. [November 1952] Proposed by Dewey Duncan, East Los Angeles Junior
College. 


In a recent text on Backgrounds for Secondary Mathematics Teachers
the following statement appears: “Consider the sequence 1, 2, 3, 2 1/2,
2, 1 1/2, 1 3/4, 2, 2 1/4, 2 1/8, ... . It has the limit 2, as can be
shown by locating the values on a number scale. There is no single
formula for this sequence.” Refute this last assertion by example.


I. Discussion by M. S. Klamkin, Polytechnic Institute of Brooklyn. 
This problem illustrates an error found in many elementary texts dealing 
with sequences and in many Civil Service examinations and intelligence 
tests. One cannot say anything about the next term of a sequence nor 
about its limit, if only a finite number of terms is given. The sequence 
is not uniquely determined unless the law of formation is stated or 
a general term is specified. Thus there is an infinity of formulas 
applicable to the given sequence. Furthermore, the limit is not nec- 
essarily 2, it can be any arbitrary number. Also, the series may diverge 
or even oscillate, depending upon the law of formation. 

The fundamental principle involved is clearly shown by the following 
more general problem. Let us fit an n-th term formula to a sequence 


of which any r terms are given. Let $a_{n_1}, a_{n_2}, \ldots, a_{n_r}$ be the $n_1$-th,
$n_2$-th, ..., $n_r$-th given terms of the sequence. A general formula which fits
this sequence is

$$a_n = \sum_{\text{cyclic}} \frac{a_{n_1}\,\phi(n_1)}{\phi(n)} \cdot \frac{(n - n_2)(n - n_3) \cdots (n - n_r)}{(n_1 - n_2)(n_1 - n_3) \cdots (n_1 - n_r)},$$

where $\phi(n_k) \ne 0$, k = 1, 2, ..., r, and otherwise $\phi(n)$ is arbitrary.

If we take $\phi(n)$ of order $n^{r-1}$ then $a_n$ approaches a limit. If we
take $\phi(n)$ of order less than $n^{r-1}$ then $a_n$ approaches ∞. Finally, if
we take $\phi(n)$ to be oscillatory, then $a_n$ oscillates, e.g., if $\phi(n) = (-1)^n$.


Editorial Note: The author of the text in question has written 
that he used “single” as synonymous with “simple” in this situation.
Those who submitted formulas accepted the limit 2 as well as the most
obvious law of formation. Almost every form received was different;
however, upon simplification they fell into typical patterns which
follow.

II. Solution by Harry M. Gehman, University of Buffalo, New York. 
We shall use a heuristic method to find a formula for $a_n$, the n-th
term of the given sequence (A).

By subtracting 2, the limit of (A), from each term, we can see
how the terms of (A) oscillate about the limit. This gives us the
sequence:

-1, 0, 1, 1/2, 0, -1/2, -1/4, 0, 1/4, 1/8, ...    (B)

The terms of (B) may be grouped into sets of three with a common
denominator $2^t$, where t = [(n - 1)/3] and [x] denotes the largest
integer ≤ x. Multiplying each term of (B) by $2^t$ results in the sequence:

-1, 0, 1, 1, 0, -1, -1, 0, 1, 1, ...    (C)

The periodicity of (C) suggests some type of sine curve. A little
experimentation shows that the terms of (C) are given by the formula:
$(2/\sqrt{3}) \sin (n - 2)\pi/3$.

Taking into account the operations which led to (C), we find that
each term of sequence (A) is given by the formula:

$$a_n = 2 + \frac{\sin (n - 2)\pi/3}{2^u \sqrt{3}},$$

where u = [(n - 4)/3] and [x] denotes the largest
integer ≤ x. This may also be written in the form

$$a_n = 2 - (1/2)^{[(n-1)/3]} (2/\sqrt{3}) \sin (n + 1)\pi/3.$$
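
Both forms are easy to test against the ten terms quoted in the proposal. The sketch below is a floating-point check only; the function names `first_form` and `second_form` are illustrative assumptions, and Python's floor division `//` plays the role of the bracket [x].

```python
from math import pi, sin, sqrt


def first_form(n: int) -> float:
    """2 + sin((n - 2)pi/3) / (2^u sqrt 3), with u = [(n - 4)/3]."""
    u = (n - 4) // 3                     # floor division implements [x]
    return 2 + sin((n - 2) * pi / 3) / (2.0 ** u * sqrt(3))


def second_form(n: int) -> float:
    """2 - (1/2)^[(n - 1)/3] (2 / sqrt 3) sin((n + 1)pi/3)."""
    t = (n - 1) // 3
    return 2 - 0.5 ** t * (2 / sqrt(3)) * sin((n + 1) * pi / 3)


expected = [1, 2, 3, 2.5, 2, 1.5, 1.75, 2, 2.25, 2.125]
for n, value in enumerate(expected, start=1):
    print(n, value, round(first_form(n), 12), round(second_form(n), 12))
```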


Others who submitted results essentially in this form are H. W. Gould,
Portsmouth, Va.; J. D. E. Konhauser, State College, Pa.; William Leong,
Student, University of California at Berkeley; H. C. Parrish, North 
Texas State College; and the proposer. 

III. Form essentially submitted by Arthur Gregory, Albuquerque, 
New Mexico; Prasert Na Nagara, College of Agriculture, Thailand; and 
Mason Phelps, Student, Harvard University.

73] 

k=1 2 
IV. Second form submitted by H. W. Gould, Portsmouth, Va.


a, = 2- {(-1) , (-1) [n73]} 


V. Form essentially submitted by Fred Marer, Los Angeles City 
College; Charles Salkind, Polytechnic Institute of Brooklyn; and L. A. 
Ringenberg, Eastern Illinois State College. 


= 2+ (n- 2- | 


VI. Second and third forms submitted by William Leong, Student, 
University of California at Berkeley. 


a, = 2+ {3[(n- 19/3] - n+ 20-1) 


[(n-1)/3] 
2 1
2° sin (n + 1)/3} , where 


1 2 
(-1)"sin mn + 1)/3}. 
The last formula has the advantage that it contains no symbols that 
would be mysterious to one who has been exposed to just high school 
mathematics. 

VII. Form submitted by H. F. Fehr, Teachers College, Columbia
University, N.Y. 


a =2-{ 


2 
n V3 
p = {2(n - 1)sin 27n/3 + (n - 3)tan 2(n - 1)7/3}/3V3. 


sin (n + 1) 7/3} E , where 


VIII. Solution by Vern Hoggatt, Oregon State College and E. G.
Goman, College of Puget Sound.


n-1 n-3 
3 


3 n n 
a, = = > 
=o k=0 


n 
53k 
k=o 


where the $\delta_i^j$ are Kronecker deltas and are equal to 1 or 0 according as
$i = j$ or $i \ne j$.


A Faulty Evaluation 


152. [November 1952] Proposed by Malcolm Robertson, Rutgers Univer- 
sity. 
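In finding the area, $A = \pi ab$, of the ellipse $\rho^2 = a^2 b^2/(a^2 \sin^2\theta + b^2 \cos^2\theta)$,
a student gets an incorrect answer as follows:

$$A = \frac{ab^2}{2} \int_0^{2\pi} \frac{a \sec^2\theta \, d\theta}{b^2 + (a \tan\theta)^2} = \frac{ab}{2} \Big[\arctan\Big(\frac{a}{b}\tan\theta\Big)\Big]_0^{2\pi} = \frac{ab}{2}(\arctan 0 - \arctan 0) = 0.$$

Detect and explain the source of error.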


Solution by Charles Salkind, Polytechnic Institute of Brooklyn, N.Y. 


The student made the not uncommon error of integrating over essential
discontinuities at π/2 and 3π/2. All would have been well if he had
written

$$A = 4 \cdot \frac{ab^2}{2} \int_0^{\pi/2} \frac{a \sec^2\theta \, d\theta}{b^2 + (a \tan\theta)^2} = 2ab \Big[\arctan\Big(\frac{a}{b}\tan\theta\Big)\Big]_0^{\pi/2} = 2ab(\pi/2 - 0) = \pi ab.$$
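
The discrepancy can be seen numerically. The sketch below rests on assumptions chosen for illustration: the semi-axes a = 3, b = 2, the midpoint rule, and the function names are not from the solution. It integrates the polar area element directly and compares the result with the value obtained by naively evaluating the arctangent antiderivative at 0 and 2π.

```python
from math import atan, cos, pi, sin, tan


def rho_squared(theta: float, a: float, b: float) -> float:
    """Polar equation of the ellipse, centre at the pole."""
    return (a * b) ** 2 / ((a * sin(theta)) ** 2 + (b * cos(theta)) ** 2)


def polar_area(a: float, b: float, steps: int = 20000) -> float:
    """Midpoint-rule estimate of (1/2) * integral of rho^2 over [0, 2 pi]."""
    h = 2 * pi / steps
    return 0.5 * h * sum(rho_squared((k + 0.5) * h, a, b) for k in range(steps))


def naive_antiderivative(theta: float, a: float, b: float) -> float:
    """(ab/2) arctan((a/b) tan theta); it jumps at pi/2 and 3 pi/2."""
    return 0.5 * a * b * atan((a / b) * tan(theta))


a, b = 3.0, 2.0
area = polar_area(a, b)
naive = naive_antiderivative(2 * pi, a, b) - naive_antiderivative(0.0, a, b)
print(area, pi * a * b, naive)
```

The direct integral agrees with πab, while the endpoint evaluation returns essentially zero because it ignores the jumps of the antiderivative inside the interval.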


Also solved by Louis Berkofsky, Lexington, Mass.; Arthur Gregory, 
Albuquerque, N. Mex.; M. S. Klamkin, Polytechnic Institute of Brooklyn, 
N.Y.; Prasert Na Nagara, College of Agriculture, Thailand. 


QUICKIES 


From time to time this department will publish problems which may be 
solved by laborious methods, but which with the proper insight may be 
disposed of with dispatch. Readers are urged to submit their favorite 
problems of this type, together with the elegant solution and the source, 
if known. 


Q 28. [March 1951] J. M. Howell offers an alternate method of simplification:
$(27 + 8i)/(3 + 2i^3) = (27 + 8i^9)/(3 + 2i^3) = 9 - 6i^3 + 4i^6 = 5 + 6i$.


“Q 57. [March 1952] Prove that the derivative of an even function is
odd and vice versa.” M. S. Klamkin offers this alternate solution:
E(x) = [E(x) + E(-x)]/2 so E'(x) = [E'(x) - E'(-x)]/2 = odd. Also,
O(x) = [O(x) - O(-x)]/2 so O'(x) = [O'(x) + O'(-x)]/2 = even. [In this
proof, the notation N'(y) means d[N(y)]/dy.]


Q 88. Given f(x) = “" + a. + Py + +++ +1, show that f(2i) = 0 (mod 9). 


[Submitted by T. C. Wilderman.] 


Q 89. A circle of radius 15 intersects another circle, radius 20, at 
right angles. What is the difference of areas of the non-overlapping 
portions? [Joseph Kennedy in School Science and Mathematics, 52, 162, 
February 1952.] 


Q 90. Prove that the sum of the vectors from the center, O, of a regular 
n-gon to its vertices is zero. [Submitted by Richard Couchman.] 


Q 91. Prove that $\log_{10} 2$ is irrational, without assuming a knowledge
that any k-th root of 10 is irrational. [Submitted by M. P. Fobes.] 


Q 92. If p is an odd prime, all quadratic residues of p are congruent
to $1^2, 2^2, \ldots, [(p - 1)/2]^2$ modulo p. For what values of p are the
quadratic residues equal to $1^2, 2^2, \ldots, [(p - 1)/2]^2$? [Submitted
by V. C. Harris.]


Q 93. If $(1 + x)^n = a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n$ and n is odd, then
$a_0^k - a_1^k + a_2^k - \cdots + (-1)^n a_n^k = 0$. [Submitted by B. K. Gold.]

Q 94. Find three integers whose sum is 117 and whose squares are in
arithmetic progression. [Edwin Tabor in THE BAT, No. 47, page 326,
November 1947.]
TRICKIES


A trickie is a problem whose solution depends upon the perception of 
the key word, phrase or idea rather than upon a mathematical routine. 
Send us your favorite trickies. 


T 6. In a bureau drawer in a dark room there are 26 grey and 26 blue 
socks. How many socks must a man take in order to be sure of having a 
pair of the same color? [Submitted by R. E. Winger.]


T 7. I invite you to play the following card game: Shuffle an ordinary
deck of cards, and turn them face up in pairs. If both cards of a 
pair are black, you get them. If both are red, I get them, and if one 
is red and one black the pair belongs to neither of us. You pay one 
dollar for the privilege of playing the game. When the game is over 
you pay nothing if you have at most the same number of cards as I, 
and for every card that you have more than I, I will pay you three 
dollars. Would you care to play with me? [Submitted by Leo Moser.] 


T 8. If you traveled from one town to another at 30 m.p.h., at what 
speed must you travel on the return trip in order to average 60 m.p.h.
for the entire journey? [Submitted by J. M. Howell.]


T 9. Weary Willie went to the zoo to feed the elephants. Buying a bag
of peanuts and wanting to treat each and every elephant alike, he took
out his notebook and did a little figuring. He found that if he gave
every elephant 7 peanuts, he'd have 6 peanuts left over. On the other
hand, if he fed every elephant 9 peanuts, there'd be 2 peanuts over.
How many peanuts should Weary Willie have given each of these elephants
to come out even? [Monte Dernham in THE BAT, No. 66, page 502, June
1949.]


SOLUTIONS

S 6. If the first two chosen are not of the same color, the third must
match one of them.
S 7. Since the neutral cards are paired, the number of cards in your
pile must equal the number in mine, so you lose a dollar each time
you play.
S 8. If the distance between the towns is x miles, the time x/30
already used and the time 2x/60 allowed for the whole trip are equal,
so no time is available for the return journey. The situation is
impossible.
S 9. Ten. Two more peanuts per elephant reduces the surplus by 4
peanuts, so there must have been 2 elephants and 20 peanuts.




FALSIES 


A falsie is a problem for which a correct solution is obtained by 
illegal operations, or an incorrect result is secured by apparently legal 
processes. For each of the following falsies, can you offer an explanation?
Send in your favorite falsies.


F 7. A student solved the equation 7 sin A = 3 as follows:
$A = \dfrac{3}{7 \sin} = \sin^{-1} \dfrac{3}{7}$. [Submitted by Norman Anning.]


F8. The value of the expression 

2 2 
- x3 + - + 1 
4 x+ 1 


) 


+ x + 1 


will not be changed if we suppress the two fractions. (M. Kraitchik, 
Mathematical Recreations, Norton (1942), page 42.) 


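F 9. Here is a student's method of proving one identity:

$$\frac{\sin 6\theta + \sin 2\theta}{\cos 6\theta + \cos 2\theta} = \frac{\sin \tfrac{1}{2}(6\theta + 2\theta)}{\cos \tfrac{1}{2}(6\theta + 2\theta)} = \frac{\sin 4\theta}{\cos 4\theta} = \tan 4\theta.$$

[Submitted by J. M. Howell.]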


F 10. Three coins are tossed at once. We can say with assurance that of
the three coins tossed, two of them must come down alike - both heads or
both tails. What of the third coin? The probability that it is heads is
1/2, that it is tails, also 1/2. In either case the probability that it is
the same as the other two is 1/2. Consequently the probability that all
three are alike is 1/2. [E. P. Northrop, Riddles in Mathematics, Van
Nostrand (1944), page 172.]
Solutions


The page on which a solution appears is in the parentheses following the number 
of the problem. 


114 (48), 115 (49), 116 (50), 117 (104), 119 (50), 120 (51), 121 (107), 122 (108), 
123 (108), 124 (109), 125 (109), 126 (111), 127 (111), 128 (158), 129 (159), 132 
(159), 133 (160), 134 (162), 135 (164), 136 (164), 137 (165), 138 (166), 139 (216), 
140 (217), 141 (218), 142 (219), 143 (220), 144 (220), 145 (221), 146 (222), 147
(278), 148 (279), 149 (282), 150 (283), 151 (284), 152 (286).


Quickies 


Q 15 (115); Q 28 (287); Q 51 (169); Q 57 (287); Q 67, 68 (53); Q 69 (54); Q 70 (54,
115); Q 71, 72, 73, 74 (iit); Q 75, 76, 77, 78, 79, 80 (169); Q 81, 82, 83, 84,
85 (225); Q 86, 87 (226); Q 88, 89, 90, 91, 92 (287); Q 93, 94 (288).


Trickies 


T 1, 2 (167); T 3, 4, 5 (966); T 6, 7, 8, 9 (289).
Falsies 


10 (290). 
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